Polyketides and Their Synthesis

Technical Field 

The present invention relates to processes and
materials (including enzyme systems, nucleic acids,
vectors and cultures) which can be used to influence the
selection of acylthioester units for the synthesis of
polyketides, and to the resulting polyketides, which may
be novel. It is particularly concerned with macrolides,
polyethers or polyenes and their preparation making use
of recombinant synthesis.

In preferred types of embodiment, polyketide
biosynthetic genes or portions of them, which may be
derived from different polyketide biosynthetic gene
clusters, are manipulated to allow the production of
specific polyketides, such as 12-, 14- and 16-membered
macrolides, of predicted structure. The invention is
particularly concerned with the modification of an Acyl
CoA:ACP transferase (AT) function, generally by modifying
genetic material encoding it in order to prepare
polyketides with a predetermined ketide unit, e.g.
incorporating (a) a malonate extender unit; or (b) a
methylmalonate extender unit; or (c) an ethylmalonate
extender unit; or (d) a further type of extender unit; or
(e) an acetate and/or malonate starter unit; or (f) a
propionate and/or methylmalonate starter unit; or (g) a
butyrate and/or ethylmalonate starter unit; or (h) a
further type of starter unit. Of course the invention can
be used to influence more than one ketide unit of a
polyketide. The method enables one to minimise
alteration to the protein structure of the polyketide
synthase.

Polyketides are a large and structurally diverse
class of natural products that includes many compounds
possessing antibiotic or other pharmacological
properties, such as erythromycin, tetracyclines,
rapamycin, avermectin, monensin, epothilone and FK506.
In particular, polyketides are abundantly produced by
Streptomyces and related actinomycete bacteria. They are
synthesised by the repeated stepwise condensation of
acylthioesters in a manner analogous to that of fatty
acid biosynthesis. The structural diversity found among
natural polyketides arises in part from the selection of
(usually) acetate (malonyl-CoA) or propionate
(methylmalonyl-CoA) as "starter" or "extender" units
(although one of a variety of other types of unit may
occasionally be selected); as well as from the differing
degree of processing of the β-keto group formed after
each condensation. Examples of processing steps include
reduction to β-hydroxyacyl-, reduction followed by
dehydration to 2-enoyl-, and complete reduction to the
saturated acylthioester. The stereochemical outcome of
these processing steps is also specified for each cycle
of chain extension. Methylation at the α-carbon or
β-hydroxy is also sometimes observed.

The biosynthesis of polyketides is performed by a
group of chain-forming enzymes known as polyketide
synthases. Two broad classes of polyketide synthase
(PKS) have been described in actinomycetes. One class,
named Type I PKSs, represented by the PKSs for the
macrolides erythromycin, oleandomycin, avermectin, and
rapamycin and by the PKS for the polyether monensin,
consists of a different set or "module" of enzymes for
each cycle of polyketide chain extension. For an example
see Figure 1 (Cortés, J. et al. Nature (1990) 348:176-178;
Donadio, S. et al. Science (1991) 252:675-679;
Swan, D.G. et al. Mol. Gen. Genet. (1994) 242:358-362;
MacNeil, D. J. et al. Gene (1992) 115:119-125; Schwecke,
T. et al. Proc. Natl. Acad. Sci. USA (1995) 92:7839-7843;
also Patent application WO 98/01546). The genes encoding
numerous Type I PKSs have been sequenced and these
sequences disclosed in publicly available DNA and protein
sequence databases including Genbank, Swissprot and EMBL.
For example, the sequences are available for the PKSs
governing the synthesis of erythromycin (Cortés, J. et
al. Nature (1990) 348:176-178, accession number X62569;
Donadio, S. et al. Science (1991) 252:675-679, accession
number M63677); rapamycin (Schwecke, T. et al. Proc.
Natl. Acad. Sci. (1995) 92:7839-7843; accession number
X86780); rifamycin (August, P. et al. Chem. Biol. (1998)
5:69-79; accession number AF040570) and tylosin (Eli
Lilly, accession number U78289), among many others.

The term "polyketide synthase" (PKS) as used herein
refers to a complex of enzyme activities responsible for
the biosynthesis of polyketides. These enzyme activities
include β-ketoacyl ACP synthase (KS), acyltransferase
(AT), acyl carrier protein (ACP), β-ketoreductase (KR),
dehydratase (DH), enoylreductase (ER) and thioesterase
(TE) but are not limited to these activities. Each of
these activities lies on a separate protein or
polypeptide fragment responsible for this activity. Such
a fragment is termed a "domain". The terms "motif" or
"signature sequence" used herein refer to a small stretch
of amino acids (usually less than 10 amino acids) within
a domain responsible (at least in part) for one aspect of
the catalytic function, for example, choice of substrate.

The term "extension module" as used herein refers to the
set of contiguous domains, from a β-ketoacyl-ACP synthase
("KS") domain to the next acyl carrier protein ("ACP")
domain, which accomplishes one cycle of polyketide chain
extension; this may or may not include domains
responsible for the reductive processing of the
polyketide chain. The term "loading module" is used to
refer to any group of contiguous domains that
accomplishes the loading of the starter unit onto the PKS
and thus renders it available to the KS domain of a
specific extension module.

Background Art

Several approaches to altering the nature of the
polyketide product of a PKS by genetic engineering have
been proposed: see particularly WO 93/13663 and WO
98/01571. The length of polyketide formed has been
altered, in the case of erythromycin biosynthesis, by
specific relocation using genetic engineering of the
enzymatic domain of the erythromycin-producing PKS that
contains the chain-releasing thioesterase/cyclase
activity (Cortés, J. et al. Science (1995) 268:1487-1489;
Kao, C.M. et al. J. Am. Chem. Soc. (1995) 117:9105-9106).
In-frame deletion of the DNA encoding part of the
ketoreductase domain in module 5 of the erythromycin-
producing PKS (also known as 6-deoxyerythronolide B
synthase, DEBS) has been shown to lead to the formation
of erythromycin analogues 5,6-dideoxy-3-α-mycarosyl-5-
oxoerythronolide B, 5,6-dideoxy-5-oxoerythronolide B and
5,6-dideoxy,6β-epoxy-5-oxoerythronolide B (Donadio, S.
et al. Science (1991) 252:675-679). Likewise, alteration
of active site residues in the enoylreductase domain of
module 4 in DEBS, by genetic engineering of the
corresponding PKS-encoding DNA and its introduction into
Saccharopolyspora erythraea, led to the production of
6,7-anhydroerythromycin C (Donadio, S. et al. Proc. Natl.
Acad. Sci. USA (1993) 90:7119-7123).

Patent application WO 00/01827 describes further
methods of manipulating a PKS to change the oxidation
state of the β-carbon. Substituting the reductive domain
of module 2 of the erythromycin-producing PKS with
domains derived from rapamycin PKS modules 10 and 13 led
to the formation of C10-C11 olefin-erythromycin A and
C10-C11 dihydroerythromycin A respectively.

The second class of PKS, named Type II PKSs, is
represented by the synthases for aromatic compounds.
Type II PKSs contain only a single set of enzymatic
activities for chain extension and these are re-used as
appropriate in successive cycles (Bibb, M. J. et al. EMBO
J. (1989) 8:2727-2736; Sherman, D. H. et al. EMBO J.
(1989) 8:2717-2725; Fernandez-Moreno, M.A. et al. J.
Biol. Chem. (1992) 267:19278-19290). The "extender"
units for the Type II PKSs are usually acetate (malonyl-
CoA) units, and the presence of specific cyclases
dictates the preferred pathway for cyclisation of the
completed chain into an aromatic product (Hutchinson, C.
R. and Fujii, I. Annu. Rev. Microbiol. (1995) 49:201-
238). Hybrid polyketides have been obtained by the
introduction of cloned Type II PKS gene-containing DNA
into another strain containing a different Type II PKS
gene cluster, for example by introduction of DNA derived
from the gene cluster for actinorhodin, a blue-pigmented
polyketide from Streptomyces coelicolor, into an
anthraquinone polyketide-producing strain of Streptomyces
galileus (Bartel, P. L. et al. J. Bacteriol. (1990)
172:4816-4826). Occasionally, unusual starter units are
incorporated by Type II PKS, particularly in the
biosynthesis of oxytetracycline, frenolicin and
daunorubicin and in these cases a separate AT is used to
transfer the starter unit to the PKS.

Fungal PKSs such as the 6-methylsalicylic acid or
lovastatin PKS typically consist of a single multi-domain
polypeptide which includes most of the activities required
for the synthesis of the polyketide portion of these
molecules (Hutchinson, C.R. and Fujii, I. Annu. Rev.
Microbiol. (1995) 49:201-238). Type II fungal PKSs are
also known.

A number of mixed systems comprising polyketide
synthase and nonribosomal peptide synthase modules have
been identified including the epothilone and bleomycin
biosynthetic clusters.

Although large numbers of therapeutically important
polyketides have been identified, there remains a need to
obtain novel polyketides that have enhanced properties or
possess completely novel bioactivity. The complex
polyketides produced by Type I PKSs are particularly
valuable, in that they include compounds with known
utility as anthelmintic, insecticidal, anticancer,
immunosuppressant, antifungal or antibacterial agents.
Because of their structural complexity, such novel
polyketides are not readily obtainable by total chemical
synthesis, or by chemical modifications of known
polyketides. Particular changes that are desired are
changes to the carbon skeleton by altering the nature of
the starter and/or extender unit(s) incorporated, changes
to the oxidation level of the β-keto carbon and therefore
the pattern of oxygen substituents by altering the series
of reductive steps that occur after chain extension and
changes to the post-PKS "tailoring" steps which generally
comprise hydroxylation, methylation or glycosylation of
the polyketide molecule.

There is also a need to develop reliable and
specific ways of deploying individual modules in practice
so that all, or a large fraction, of hybrid PKS genes
that are constructed are viable and produce the desired
polyketide product. Various strategies have been
described to produce these hybrid PKSs particularly
utilising recombinant DNA technology and de novo
biosynthesis. There is a particular need to develop
methods of manipulating these PKSs in a manner that
minimises the alteration to the PKS protein structure.
Existing methods of achieving these manipulations
sometimes produce hybrid PKS multienzymes which give the
desired product at only 1% or less of the rate that the
unmodified PKS produces product.

WO 93/13663 and WO 98/01571 describe novel methods
of engineering PKSs. A well-established method of
altering the nature of the extender unit used at any
position in the polyketide molecule, particularly
malonyl-, methylmalonyl- or ethylmalonyl-CoA, is by domain
substitution. For example, WO 98/01546 and US patent
6,063,561 disclose methods of accomplishing this
modification to form modified erythromycins. Novel
polyketide molecules, in this case particularly novel
erythromycins, are produced by the replacement of an
entire AT domain-encoding DNA fragment on the
Saccharopolyspora erythraea chromosome with an equivalent
heterologous AT domain-encoding fragment from another PKS
cluster. It is well known to those skilled in the art
that selection of the exact DNA/protein splice sites into
which to insert the heterologous domain requires detailed
analysis of the corresponding DNA and protein sequences.

Different researchers choose to use splice sites at
conserved, semi-conserved or non-conserved regions of the
protein, or at sites either within or at the boundaries
of the AT domains. A further drawback of this technique
is that it is hard to predict whether a particular
heterologous domain will work in any given context. A
domain that works successfully in one module may not work
at all in an adjoining module or may produce polyketides
at a vastly reduced yield. Oliynyk, M. et al. (Chem.
Biol. (1996) 3:833-839) and Ruan et al. (J. Bact. (1997)
179:6416-6425) have published studies that exchange a
methylmalonyl-CoA specific AT domain for malonyl-CoA
specific AT domains in modules of the erythromycin PKS.
Products were observed only for changes in modules 1 and
2, with module 2 at a vastly lowered yield. Stassi et
al. (Proc. Natl. Acad. Sci. (1998) 95:7305-9) exchanged
the methylmalonyl-CoA specific AT of module 4 of the
erythromycin PKS for an ethylmalonyl-CoA specific AT and
again product yield was low even after the addition of
the crotonyl-CoA reductase gene thought to increase the
supply of the required ethylmalonyl-CoA precursor. A
possible reason for the limiting yields is the structural
or mechanistic non-compatibility of a heterologous AT
domain with the adjoining KS and ACP domains with which
it must interact properly for efficient polyketide chain
synthesis. Consequently, it is often necessary to try
multiple domain swaps to achieve a novel polyketide-
producing strain that displays adequate efficiency, a
process made particularly arduous when these changes must
be made by gene replacement on the chromosome through a
two-step double integration process. The introduction of
splice sites at the DNA level is time consuming and
technically challenging, requiring careful analysis to
ensure the PKS protein coding reading frame is not
disrupted. The introduction of restriction enzyme sites
often requires changes at the amino acid level which lead
to further PKS protein structure disruption and
consequent loss of catalytic efficiency.

A method that could utilise the numerous techniques
available for site directed mutagenesis to influence the
AT substrate specificity with minimal disruption to the
protein tertiary structure would be a valuable addition
to the current techniques.

Changes to an active site have been shown to alter
substrate specificity in other systems. For example, in
an early study, Scrutton et al. (Nature (1990) 343:38-43)
used site directed mutagenesis to switch the coenzyme
substrate specificity of a glutathione reductase.
Identifying and changing a 'fingerprint' structural motif
in the NADP+ binding domain they could convert the enzyme
into one displaying a marked preference for NAD+. The
techniques of directed evolution have been used to
improve/change enzyme catalytic function. Of many
examples in the literature, Zhang et al. (PNAS (1997)
94:4504-4509) illustrate the conversion of a
galactosidase to a fucosidase by these techniques. The
resulting protein bears 6 mutations, of which 3 lie in,
or in close proximity to, the active site.

Minor but directed changes to a PKS domain can make
significant changes to its catalytic function. Patent
application WO 00/00500 teaches that an extender
ketosynthase domain is converted to a decarboxylating
(and hence loading) ketosynthase domain by site directed
mutagenesis at the active site. US Patent numbers
6,004,787 and 6,066,721 and Jacobsen et al. Science
(1997) 277:367-369 describe the deletion of residues in
the KS1 active site to inactivate this activity to allow
the production of novel polyketides by feeding of
synthetic precursors to the modified PKS.

Several studies have attempted to correlate the
primary amino acid sequence of the AT to determine amino
acids directly involved with the recognition of the
appropriate substrate, and particularly the nature of the
substrate side chain (i.e. the malonyl portion of the
acyl-CoA thioester). Studies by Haydock et al. (FEBS
Lett. (1995) 374:246-248) correlated the substrate
specificity of malonyl- or methylmalonyl-CoA specific AT
with a motif 11 amino acids upstream of the known active
site. Comparisons between this motif and the protein
structure of a known acyl transferase from E. coli fatty
acid synthase allowed the authors to assess the proximity
of the motif residues to the active site (and hence its
ability to select the substrate). The authors
acknowledged that "this divergent region thus identified
lies near the acyl transferase active site though not
close enough to make direct contact with the substrate".

Other studies (Katz, L. Chem. Rev. (1997) 97:2557-2575;
Tang, L. et al., Gene (1998) 216:255-265) have correlated
additional residues with a specific extender unit using
these residues as a tool to predict the AT substrate
specificity from a protein sequence derived from
polyketide gene cluster sequencing projects. It has
remained unclear which residues have mechanistic
importance. In only one case have regions within the PKS
AT domain been exchanged in an attempt to swap AT
specificity; patent application WO 00/01838 and Lau et
al. (Biochemistry (1999) 38:1643-51) implicated a
'hypervariable region' at the C-terminus of the AT domain
in the selection of extender unit. These workers
interchanged this 25-30 amino acid stretch and showed
that this change was sufficient to alter the substrate
specificity of the AT, concluding "a short (23-35 amino
acid) terminal segment present in all AT domains is the
principal determinant of their substrate specificity.
Interestingly its length and amino acid sequence vary
considerably among the known AT domains. We therefore
suggest that the choice of extender units by the PKS
modules is influenced by a "hypervariable region", which
could be manipulated via combinatorial mutagenesis to
generate novel AT domains possessing relaxed or altered
substrate specificity". Surprisingly, our structure
molecular modelling studies indicate this region lies at
a surface accessible region away from the active site and
hence is unlikely to directly interact with (and hence
directly select) the malonyl portion of the substrate
used. The effect on substrate specificity is therefore
likely to be imprecise and due to more indirect effects
via, for example, disruption of tertiary structure.

Disclosure of Invention

According to a first aspect of the present invention
there is provided a method of synthesising a compound
whereof at least a portion is the product of a polyketide
synthase (PKS) enzyme complex or is derived from such a
product, said PKS enzyme complex including at least one
acyl transferase (AT) domain. The method includes a step
of providing said PKS enzyme complex in which said AT
domain has been altered to change selectively a minor
proportion of amino acid residues. The altered
residue(s) may comprise one or more motifs which are
present in the active site pocket of the AT domain and
which influence the substrate specificity of the AT
domain, the alteration affecting the substrate
specificity; and/or one or more residues of a motif which
influences the substrate specificity of the AT domain and
which comprises a four-residue sequence corresponding to
the YASH motif of the AT domain of the first module of
DEBS, the alteration affecting the substrate specificity.
Synthesis is then effected by means of said PKS enzyme
complex to produce a compound or mixture of compounds
different from what could have been produced by means of
a PKS enzyme in which said AT domain had not been
altered.

The PKS enzyme complex may be at least part of a
modular type I PKS enzyme complex, or it may be derived
from a type II PKS system, a fungal PKS system or a
hybrid system comprising PKS and nonribosomal peptide
synthase modules.

The present invention teaches that by altering a few
amino acid residues in the AT domain and particularly
residues close to the AT active site comprising one or
more residues of a short signature "motif" within the AT
domain it is possible to influence the acylthioester
selected by that AT domain. Novel polyketides can be
made by a modified PKS on which the signature motif on
one or more modules is altered, e.g. being replaced with
one associated with a different specificity for malonyl
substrate. Furthermore, the invention provides a method
of reducing the proportion of mixed polyketide products
that are occasionally found in natural systems due to
non-specific incorporation of the incorrect extender
units. Conversely, the invention provides a method of
giving a mixed population of polyketide products thus
increasing the diversity of polyketides produced by a
PKS.

The invention allows the preparation of a modified
PKS by substitution of an existing amino acid residue
motif in the AT that specifies incorporation of one of
the common extender acylthioesters with another motif
found in another AT specifying an alternative
acylthioester. This alters the substrate specificity of
the polyketide synthase when it is expressed in a
polyketide-producing organism.

The DNA sequences have been disclosed for numerous
Type I PKS gene clusters. Comprehensive sequence
analysis of AT domains derived from Type I PKS modules
responsible for the formation of macrolides, particularly
erythromycin, rapamycin, avermectin, rifamycin, FK506,
epothilone, tylosin, and niddamycin, ionophore
polyethers, particularly monensin, and polyenes,
particularly nystatin, allowed us to identify amino acids
that are characteristic of AT domains.

Figure 2 shows the sequence comparison of these AT
domains. This sequence comparison has been generated in
a generally conventional way, employing a computer using
a procedure that creates a multiple sequence alignment
from a group of related sequences. We used a program
called PileUp (Wisconsin Package, Genetics Computer Group
(GCG), Madison, WI, USA), which creates a multiple
sequence alignment using a simplification of the
progressive alignment method of Feng and Doolittle
(Journal of Molecular Evolution 25; 351-360 (1987)). The
method used is similar to the method described by Higgins
and Sharp (CABIOS 5; 151-153 (1989)). The program
executes a series of progressive, pairwise alignments
that allows a large number of sequences to be compared
together to form a final alignment throughout all the
sequences. Gaps can be inserted throughout individual
sequences to allow alignment of regions of strong
similarity. This is often required as strongly conserved
regions are often separated by more variable regions,
both in terms of numbers of amino acids and type of amino
acids. Different programs use different mathematical
algorithms to make these comparisons, resulting in
alignments that differ in minor ways. However, it can be
expected that regions of strong homology would still
align whatever alignment program is utilised. The
particular motifs that are discussed are marked.
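
As an illustration of the pairwise step on which progressive alignment programs build, the following is a minimal sketch of a global (Needleman-Wunsch style) alignment of two sequences. It is not the PileUp implementation: the function name global_align and the match, mismatch and gap scores are assumptions chosen only for this example, and real programs use amino acid substitution matrices and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences with a simple scoring scheme.

    Returns (aligned_a, aligned_b, score). The scores are illustrative
    assumptions, not the values used by any particular program.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Score for pairing residue a[i-1] with residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]
```

Aligning the toy strings "GHSAYASH" and "GHSYASH" with this sketch inserts a single gap so that the shared residues line up, which is the behaviour the text describes for regions of strong similarity separated by variable stretches.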

These motifs include the conserved GQG motif that is
close to the start of the domain, the GHS motif that
contains the active site serine that covalently binds the
acyl chain prior to transfer to the ACP, and an LPTY motif
that is close to the end of the domain. Other residues
are common to all ATs, including an arginine believed to
stabilise the carboxylate group of the acylthioester.
Further detailed sequence analysis allowed us to identify
amino acid residues that differed between ATs responsible
for the incorporation of malonyl-, methylmalonyl- and
ethylmalonyl-CoA. Some of these amino acids or motifs
had been previously identified during the sequence
analysis of the clusters as previously discussed. While
these motifs could predict whether malonyl- or
methylmalonyl-CoA might be used, they generally fail to
show a difference between methylmalonyl- and ethylmalonyl-
CoA or the other larger extender units commonly used. We
viewed this as an important requirement for
identification of the most important and key residues
involved in substrate recognition and consequently
residues most suitable for alteration. Closer analysis
identified a string of four residues (location identified
clearly in Figure 2) of which two residues are virtually
invariant throughout all ATs, and two residues differ
consistently depending on the extender unit.
Particularly, in the vast majority of ATs responsible for
recognition of malonyl-CoA the sequence of residues HAFH
was identified; in the majority of ATs responsible for
recognition of methylmalonyl-CoA the equivalent segment
was substituted by residues YASH. In ATs responsible for
ethylmalonyl-CoA or other similar sized CoA unit
incorporation the overall motif was different, less



WO 02/14482    PCT/GB01/03642



conserved but generally displayed the sequence XAGH
(where X is most frequently but not limited to T, V or
H). We typically use the terms HAFH, YASH and TAGH to
describe these motifs with respect to malonyl-CoA,
methylmalonyl-CoA and ethylmalonyl/further CoA
specificity but use these terms herein to allow
substitutions in the motif, particularly at residue 1 as
described. Potential substitutions and the exact
location of the motif will be clear to those skilled in
the art by inspection of Figure 2 or similar sequence
analysis.
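
The correspondence between motif and extender unit described above can be written down as a small lookup, shown here as a sketch only. The function name classify_at_motif and the returned labels are hypothetical; the rules encode just the three patterns named in the text (HAFH, YASH and XAGH) and return "unassigned" for anything else, including hybrid or unusual motifs, rather than guessing.

```python
import re


def classify_at_motif(motif):
    """Map a four-residue AT motif to the extender unit class named in the text.

    Illustrative only: a sequence-based hint, not a measurement of specificity.
    """
    motif = motif.upper()
    if len(motif) != 4:
        raise ValueError("motif must be exactly four residues")
    if motif == "HAFH":
        return "malonyl-CoA"
    if motif == "YASH":
        return "methylmalonyl-CoA"
    # XAGH: the first residue varies (most frequently T, V or H).
    if re.fullmatch(r"[A-Z]AGH", motif):
        return "ethylmalonyl-CoA or similar sized unit"
    return "unassigned"
```

For example, classify_at_motif("TAGH") and classify_at_motif("VAGH") both fall in the XAGH class, while a motif outside these patterns is reported as "unassigned" and would need inspection of the alignment.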

There are three possible methods to locate the
position of the motif within an AT sequence that does not
appear in Figure 2. It is likely a combination of the
methods will be used.

I) By simple visual inspection and comparison of
the sequence to identify the motifs HAFH, YASH
or TAGH. Since substitutions of residue one
are often encountered a useful procedure is to
look for an alanine (A) separated by one amino
acid (usually F, S or G) from a histidine (H).

II) By counting amino acids from the active site
serine. The start of the motif is typically
(but should not be limited to) between 90 and
100 amino acids downstream of the active
site motif.

III) By computer generated multiple alignment that
allows the new sequence to be directly compared
to the sequences and motifs we have annotated
in Figure 2 or to other ATs.
It is preferable to use the third method as this
allows the motif to be identified unequivocally when
there are substitutions within the motif. This is
particularly necessary in some of the more unusual types
of AT in which one of the residues can be substituted by
proline (P). The third method will also identify the
motif when the number of residues between the motif and
the AT active site serine differs significantly from the
norm. The third method will also better identify the
motif when the same or similar string of amino acids
occurs elsewhere in the domain.
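
The first two methods can be combined in a short script, given here as a hedged sketch rather than a definitive procedure. It assumes the third motif residue is F, S or G (as in HAFH, YASH and TAGH), measures the offset from the first residue of the first GHS found in the sequence, and uses a default window of 80 to 110 residues; the function name find_motif_candidates and these conventions are assumptions made for illustration. As noted above, unusual ATs (for example those with proline in the motif) will be missed by such a search and call for a multiple alignment.

```python
import re


def find_motif_candidates(at_seq, min_offset=80, max_offset=110):
    """List candidate four-residue motifs of the form x-A-[F/S/G]-H.

    Returns tuples (start_index, motif, offset_from_GHS) for every match
    whose start lies min_offset..max_offset residues after the start of
    the first GHS in the sequence. Indices are 0-based.
    """
    ghs = at_seq.find("GHS")
    if ghs < 0:
        return []  # no active site motif found, nothing to count from
    hits = []
    # A lookahead is used so that overlapping matches are not skipped.
    for m in re.finditer(r"(?=(.A[FSG]H))", at_seq):
        start = m.start(1)
        offset = start - ghs
        if min_offset <= offset <= max_offset:
            hits.append((start, m.group(1), offset))
    return hits
```

A candidate found this way is only a starting point; where several hits or none are returned, comparison against an annotated alignment remains the more reliable route.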

A particular feature of these motif residues is the
relationship of the size of the third residue compared to
the substrate selected. Hence, when malonyl-CoA is
required the third residue is large (phenylalanine), when
methylmalonyl-CoA is required this residue is
intermediate (serine), and when ethylmalonyl-CoA is
required this residue is small (glycine). The inverse
relationship between substrate side chain size and this
third residue is particularly noteworthy. Interestingly,
this relationship applies also when considering the
incorporation of the more unusual extender units such as
methoxymalonyl-CoA, required for some cycles of chain
extension during production of, for example, FK506 (HAGH).
Currently, only a single example of an AT responsible
for the incorporation of a five carbon-CoA unit has been
disclosed. In this case the AT displays a different
motif at this point, CPTH, in which only the histidine is
conserved. The incorporation of a proline residue in the
motif may be indicative of an AT specifying a larger
substrate. Proline is also found in the motif in ATs that
incorporate the larger unusual starter acids as seen in
the case of avermectin and soraphen. Residues in and
around this area, but lying in the active site of the AT
domain, define the specificity of the domain towards the
substrate chosen.

Motifs that represent hybrids of motifs for malonyl-
and methylmalonyl-CoA or methylmalonyl- and ethylmalonyl-
CoA were identified. Particularly, epothilone module 3
(expected HAFH or YASH, malonyl-CoA or methylmalonyl-CoA
specific; found HASH) and monensin module 5 (expected TAGH,
ethylmalonyl-CoA specific; found VAGH). Significantly,
in both these cases the products of the PKS are a mixture
due to the incorporation of 2 different extender units by
the module containing the hybrid motif, causing formation
of monensins A and B and epothilones A and B. However,
it is known that substrate supply is a significant
determinant of the proportion of monensins A and B formed
(Liu, H. and Reynolds, K.A. (1999) J. Bact. 181:6806-
6813).

Many of the previously-proposed "predictive" motifs
are unlikely to be the principal determinant of substrate
specificity because they are not located in the active
site pocket. A particular requirement of any motif that
can serve to distinguish between substrates is that it
lies close to the active site and preferably within the
substrate binding pocket. In this analysis we consider
the substrate binding pocket to be the part of the pocket
that binds/recognises the malonyl portion of the
acylthioester rather than necessarily the coenzyme A
portion. In all probability some of the similarities
previously identified by sequence analysis are due to
evolutionary conservation rather than a mechanistic
requirement. In contrast the residues we have identified
lie in or close to the substrate binding pocket. To
assess the exact location of the motif in space we
compared the protein sequence of ATs derived from Type I
PKS with that of E. coli fatty acid malonyl-CoA:ACP
acyl transferase, for which there is a high resolution X-
ray crystal structure (Serre, L. et al., J. Biol. Chem.
(1995) 270:12961-12964). While the overall level of sequence
similarity between these proteins is low, key residues
(and particularly those with mechanistic importance) are
conserved and the overall spatial arrangement of amino
acids is expected to be conserved. Many groups have used
this structure as a model AT and it is well known in the
art that conservation of structure can be greater than
the level of sequence conservation. Structural analysis
showed that the identified motif would lie within the
active site pocket opposite the active site serine and
the arginine thought to be involved in binding the
substrate carboxylate and close enough to the
acyltransferase site to interact with the bound substrate
side chain. The invariant histidine found in the motif
is thought to be part of a catalytic triad with the active
site serine as is typically found in serine hydrolases
(Serre et al. supra). Figure 3 shows the position of the
motif loop and important active site residues in the
model AT structure.

Broadly the invention concerns modifying an AT
domain by changing the four-residue sequence or motif
responsible for selecting a substrate so that its
specificity is altered. We may also change a small
number of other residues close to the active site.
Generally the total number of residues changed is less
than 5% of the residues of the AT.

The motif is the four-residue sequence corresponding
to the YASH motif found at about residues 334-337 of the
AT domain of the first module of DEBS, numbering as shown
in Fig. 2. It lies in the active site pocket. It
typically starts 80-110, more particularly 90-100, amino
acids downstream of the GHS active site motif.
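
As a minimal sketch of how the positional rule above might be
applied to a sequence, the following Python function searches a
window downstream of the GHS active site motif for one of the
four-residue motifs named in this specification. The function
name, the toy sequence, and the choice to measure the offset
from the first residue of GHS are assumptions made purely for
illustration; in practice the motif would be located from an
alignment such as that of Fig. 2.

```python
# Four-residue motifs named in the text, with the specificity attributed to them.
KNOWN_MOTIFS = {
    "HAFH": "malonyl-CoA",
    "YASH": "methylmalonyl-CoA",
    "TAGH": "ethylmalonyl-CoA",
    "HASH": "hybrid",
    "VAGH": "hybrid",
}


def find_specificity_motif(seq, min_offset=80, max_offset=110):
    """Return (offset_from_GHS, motif, label) for the first known motif that
    starts min_offset..max_offset residues after the G of GHS, else None.

    Measuring from the G of GHS is an assumption; the text does not say
    which residue of GHS the distance is counted from.
    """
    ghs = seq.find("GHS")
    if ghs < 0:
        return None
    for offset in range(min_offset, max_offset + 1):
        start = ghs + offset
        motif = seq[start:start + 4]
        if motif in KNOWN_MOTIFS:
            return offset, motif, KNOWN_MOTIFS[motif]
    return None


# Toy sequence (not a real AT domain): GHS, 92 filler residues, then YASH.
toy_at = "M" * 10 + "GHS" + "A" * 92 + "YASH" + "S" + "M" * 20
print(find_specificity_motif(toy_at))
```

Restricting the search to the stated window is what keeps a
chance four-residue match elsewhere in the domain from being
reported.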

In a preferred embodiment of this invention
polyketides of desired structure are produced by the
replacement of an existing AT motif on a PKS with an
alternative one responsible for selection of an
alternative extender or starter unit, or responsible for
an altered degree of selectivity (in most cases,
increased selectivity). This may be carried out in one
or more of the modules encoding a PKS cluster. One type
of embodiment is a PKS including two adjoining domains,
which are "naturally" adjoining or otherwise coupled
domains, wherein the first of them is an AT domain where
the four-residue motif has been altered to change its
specificity, the AT domain acting to transfer a substrate
to the second domain.

In one class of embodiments, this invention provides
a PKS multienzyme or part thereof, or nucleic acid
(generally DNA) encoding it, said multienzyme or part
comprising a loading module and a plurality of extension
modules for the generation of a polyketide, preferably
selected from macrolides, polyethers or polyenes,
wherein the loading or extension modules or at least one
thereof contain a modified AT domain adapted to load and
transfer an optionally substituted malonyl-CoA residue to
(preferably) the ACP. The AT domain is preferably
modified to alter its substrate specificity. This AT
domain may differ from one naturally found in this
position in the module only by the modification of a few
amino acids lying in the active site. This modification
comprises the exchange of all or part of a motif,
particularly but not limited to HAFH with YASH or TAGH or
vice versa. Optionally, alterations to amino acids
outside this sequence, but preferably lying close to the
AT active site, are made.
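
The motif exchange described above changes very few residues.
As an illustrative sketch only, the Python function below swaps
one four-residue motif for another in a protein sequence and
reports how many positions actually differ, so that the result
can be compared with the "less than 5%" guideline given earlier.
The function name and the 300-residue toy sequence are
hypothetical and do not represent a real AT domain.

```python
def swap_motif(at_seq, old_motif, new_motif):
    """Replace the single occurrence of old_motif with new_motif.

    Returns (new sequence, number of residues changed, fraction changed).
    """
    if len(old_motif) != len(new_motif):
        raise ValueError("motifs must be the same length")
    if at_seq.count(old_motif) != 1:
        raise ValueError("expected exactly one occurrence of the motif")
    start = at_seq.index(old_motif)
    new_seq = at_seq[:start] + new_motif + at_seq[start + len(old_motif):]
    # Count only positions that really differ (A and the final H are shared
    # between YASH and HAFH, for example).
    changed = sum(a != b for a, b in zip(at_seq, new_seq))
    return new_seq, changed, changed / len(at_seq)


# Toy 300-residue sequence, for illustration only.
toy_at = "M" * 10 + "GHS" + "A" * 92 + "YASH" + "S" + "M" * 190
new_seq, changed, fraction = swap_motif(toy_at, "YASH", "HAFH")
print(changed, f"{fraction:.1%}")
```

Insisting on exactly one occurrence is a deliberate safeguard:
a plain string replacement could otherwise alter an unrelated
stretch of the domain that happens to share the same four
residues.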

A second class of embodiments provides a method of
synthesising polyketides having a desired extension unit
at any point around the polyketide molecule by providing
a PKS multienzyme incorporating one or more modified AT
domains and particularly but not limited to an AT domain
possessing the motif HAFH or YASH or TAGH where these
motifs replace the existing natural motif. Optionally,
alterations to amino acids outside this sequence, but
preferably lying close to the AT active site, are made.

A third class of embodiments provides a method of
synthesising polyketides having a desired starter unit by
providing a PKS multienzyme incorporating a modified AT
domain in the loading module and particularly (but not
limited to) an AT domain possessing the motif HAFH or
YASH or TAGH or a motif incorporating a proline residue
where these motifs replace the existing natural motif.
Optionally, alterations to amino acids outside this
sequence, but preferably lying close to the AT active
site, are made. Preferentially, this AT will follow a
KSQ domain but other loading systems are known in the art
(e.g. AT-ACP). Patent application WO 00/00500 describes
some of the known loading systems. The modification of
the loading module can be combined with similar
modifications in other extension units.

A further class of embodiments provides a method of
synthesising polyketides free of natural co-produced
analogues and having a desired extender or loading unit
by replacing an existing hybrid or alternative protein
motif with the sequences HAFH, YASH or TAGH. It is
particularly useful to make this alteration in the
epothilone or monensin PKS gene cluster.

In still further aspects this invention provides a
method of synthesising a mixed population of polyketides
by providing a PKS multienzyme incorporating an AT with an
altered or hybrid motif, particularly, but not limited to,
HASH or VAGH. One particular utility of this method,
though not limited to this utility, is the production of
combinatorial libraries of compounds.

In a further aspect the PKS containing a modified AT
domain may be spliced to a hybrid PKS produced for
example as in WO 98/01546 and WO 98/01571 or WO 00/01827
or WO 00/00500. It is particularly useful to link such a
modified PKS to gene assemblies that produce novel
derivatives of natural polyketides, for example
14-membered macrolides.

Each of these aspects and classes of embodiment may
involve providing nucleic acid encoding the polyketide
synthase multienzyme and introducing it into an organism
where it can be expressed. Suitable plasmids and host
cells are described below. The polyketide synthase so
produced or portions thereof may be isolated from the
host cells by routine methods, though it is usually
preferable not to do so. The host cells may also be
capable of producing the required acylthioester, e.g. by
producing ethylmalonyl-CoA. It may be
advantageous to remove the PKS from a strain with a
particularly strong supply of an undesired acylthioester
or express the altered PKS in a strain specifically
chosen to have a strong supply of a particular
acylthioester, or alternatively to develop media or
growth conditions to enhance expression of the desired
product. Conversely, such techniques could be used to
promote formation of mixtures of products if so desired.
It may also be beneficial to supply chemical precursors
to the desired acylthioesters in the media, e.g. supply
diethylethylmalonate or cyclobutane carboxylic acid etc.

The host cells may also be capable of modifying the
initial PKS products, e.g. by carrying out all or some of
the biosynthetic modifications normal in the production
of erythromycin (as shown in Figure 4) and for other
polyketides. Use may be made of mutant organisms such
that some or all of the normal pathways are blocked, e.g.
to produce products without one or more "natural" hydroxy
groups or methyl groups or sugar groups.

The invention should not be limited to the exact
motifs described. We have described some of the known
variations within the motif, particularly at residue 1, as
can be determined by inspection of Figure 2 or by
inspection of similar sequence data. However other
modifications can be envisaged; substitution of, for
example, the phenylalanine in the malonyl-CoA motif by
the similar sized tyrosine may still display the same
selectivity. Similarly, the small residue glycine found
in the motif responsible for ethylmalonyl-CoA/other
extender incorporation might be substituted by, for
example but not limited to, alanine. It is well known to
those skilled in the art that these and other similar
conservative substitutions frequently maintain the same
selectivity. Similarly the serine residue found in the
motif for incorporation of methylmalonyl-CoA could be
substituted by a residue intermediate in size and/or
displaying a similar charge distribution.

The invention should not be limited to changes only
in this motif. Alterations to other residues around the
AT domain may also be required to increase the level of
specificity or catalytic efficiency, i.e. to increase the
proportion or amounts of the desired products. These
residues are preferentially close to the substrate
binding pocket. The requirement for these additional
alterations will depend on the particular context or
change desired. Particular residues to alter can be
readily identified by inspection of Figure 2 or other
similar sequence analysis data or alternatively by
analysis of the structural model.

Residues that may be altered in addition to the
motif can be divided into two classes. Some of these
residues may have been previously identified in the
motifs used to predict the specificity of a motif (i.e.
Haydock et al. FEBS Lett. (1995) 374:246-248). These
residues are preferentially close to the substrate
binding pocket. These residues should not be limited to
the particular examples described.

I) The first class of potential residues to change
includes residues close to the motif on the polypeptide
chain. A particular example is the residue immediately
after the four-residue motif described in the present
invention. In malonyl-CoA specific ATs this residue is
generally serine (S), i.e. the protein sequence at this
point is generally HAFHS, whereas in methylmalonyl-CoA
specific ATs this residue can be S but can also be T, G,
or C for example. Thus to change a methylmalonyl-CoA
specific AT to a malonyl-CoA specific AT by changing the
signature motif it may be beneficial also to ensure that
the residue immediately after the motif is an S. Since
this residue is close to the motif on the polypeptide
chain it lies close to the substrate binding pocket.

II) The second class includes residues that are
close to the motif or active site in space. These
residues are best identified by reference to the model AT
structure described previously or another AT structure
that may be subsequently derived. It is known to those
skilled in the art that it is possible to thread related
protein sequences into an existing structure by using
structure molecular modelling or related techniques.
Alternatively, an acylthioester may be modelled into the
active site. These are the preferred methods, but often
simple inspection of the existing structure using the
highly conserved motifs as a reference point gives a
reasonable approximation.

A particular example of a residue close in space to
the motif that might be changed is the residue
immediately after the GHS active site motif. In
methylmalonyl-CoA specific ATs this residue is generally
glutamine (Q), i.e. the protein sequence at this point is
GHSQ, whereas in malonyl-CoA specific ATs this residue is
often V, I or L for example. Thus to change a
malonyl-CoA specific AT to a methylmalonyl-CoA specific AT by
changing the signature motif it may be beneficial also to
ensure that the residue immediately after the GHS motif
is a Q. Since this residue is close to the active site
serine it lies within the substrate-binding pocket.

A further example of a residue close in space that
might be altered is the residue lying three residues
downstream of the GQG motif. In methylmalonyl-CoA
specific ATs this residue is generally tryptophan (W),
i.e. the protein sequence at this point is GQGXXW,
whereas in malonyl-CoA specific ATs this residue is often
H or T for example. Thus to change a malonyl-CoA
specific AT to a methylmalonyl-CoA specific AT by
changing the signature motif it may be beneficial also to
ensure that this particular residue after the GQG motif
is a W. Analysis of the model AT structure shows that
the GQG motif lies close to the active site pocket and
consequently so does this tryptophan.

A further example of a residue close in space that
might be altered is the residue four residues downstream
from the conserved arginine referred to above, which is
believed to stabilise the carboxylate group of the
acylthioester substrate. In malonyl-CoA specific ATs
this residue downstream of the R is generally methionine
(M), i.e. the protein sequence at this point is RXXXMQ.
In methylmalonyl-CoA specific ATs this residue is
generally I or L, and in ethylmalonyl-CoA specific ATs it
is often W. Thus, for example, to change a
methylmalonyl-CoA specific AT to a malonyl-CoA specific
AT by changing the signature motif it may be beneficial
also to ensure that this particular residue is a
methionine. Analysis of the model AT structure shows
that this residue lies close to the active site pocket.
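
The four context positions discussed above can be collected
into a small lookup. The Python sketch below records, for
malonyl-CoA and methylmalonyl-CoA specific ATs, the residues
that the text reports at each position, and lists the positions
in an observed AT that do not match a chosen target
specificity. The position labels and the function name are
hypothetical; and because the text describes these residues as
"generally" or "often" present, the output should be read as
candidate changes rather than as rules.

```python
# Residues the text associates with each specificity at four context positions.
CONTEXT_RESIDUES = {
    "malonyl-CoA": {
        "after_motif": set("S"),    # HAFHS
        "after_GHS": set("VIL"),
        "GQG_plus_3": set("HT"),
        "R_plus_4": set("M"),       # RXXXMQ
    },
    "methylmalonyl-CoA": {
        "after_motif": set("STGC"),
        "after_GHS": set("Q"),      # GHSQ
        "GQG_plus_3": set("W"),     # GQGXXW
        "R_plus_4": set("IL"),
    },
}


def suggest_context_changes(target, observed):
    """Map each mismatching position to the residues reported for the target."""
    expected = CONTEXT_RESIDUES[target]
    return {
        pos: sorted(expected[pos])
        for pos, residue in observed.items()
        if pos in expected and residue not in expected[pos]
    }


# Residues read off a hypothetical methylmalonyl-CoA specific AT.
observed = {"after_motif": "T", "after_GHS": "Q", "GQG_plus_3": "W", "R_plus_4": "L"}
print(suggest_context_changes("malonyl-CoA", observed))
```

Here the observed residues would have to be read from an
alignment or structural model, since the conserved arginine in
particular cannot be located reliably by a simple text search.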

In further aspects the present invention provides
vectors, such as plasmids or phages (preferably
plasmids), including nucleic acids as defined in the
above aspects and host cells, particularly
Saccharopolyspora or Streptomyces species, transformed
with such nucleic acids or constructs. It will be
readily apparent to those skilled in the art that there
are multiple molecular biological methods for achieving
the desired alterations to the AT domain, particularly at
the nucleic acid level, e.g. techniques of site directed
mutagenesis or directed evolution. Suitable plasmid
vectors and genetically engineered cells suitable for
expression of PKS genes with modules incorporating an
altered AT domain can readily be designed or selected by
those skilled in the art. They include those described
in WO 98/01546 as being suitable for expression of hybrid
PKS genes of Type I. Examples of effective hosts are
Saccharopolyspora erythraea, Streptomyces coelicolor,
Streptomyces avermitilis, Streptomyces griseofuscus,
Streptomyces cinnamonensis, Streptomyces fradiae,
Streptomyces longisporoflavus, Streptomyces
hygroscopicus, Micromonospora griseorubida, Streptomyces
lasaliensis, Streptomyces venezuelae, Streptomyces
antibioticus, Streptomyces lividans, Streptomyces
rimosus, Streptomyces albus, Amycolatopsis mediterranei,
and Streptomyces tsukubaensis. These include hosts in
which SCP2*-derived plasmids are known to replicate
autonomously, such as for example S. coelicolor, S.
avermitilis and S. griseofuscus; and other hosts such as
Saccharopolyspora erythraea in which SCP2*-derived
plasmids become integrated into the chromosome through
homologous recombination between sequences on the plasmid
insert and on the chromosome; and all such vectors which
are integratively transformed by suicide plasmid vectors.
A plasmid with an int sequence will integrate into a
specific attachment site on the host's chromosome.

It is apparent to those skilled in the art that the
overall sequence similarity between nucleic acids
encoding comparable AT domains from Type I PKSs is
sufficiently high, and the domain organisation of
different Type I PKSs so consistent between different
polyketide-producing organisms, that the processes for
obtaining novel hybrid polyketides described will be
generally applicable to all natural modular Type I PKSs
or their derivatives.

The present invention will now be illustrated, but
is not intended to be limited, by means of some examples.
Amino acids have been defined throughout by their
standard one letter codes as follows: A-alanine,
R-arginine, N-asparagine, D-aspartic acid, C-cysteine,
Q-glutamine, E-glutamic acid, G-glycine, H-histidine,
I-isoleucine, L-leucine, K-lysine, M-methionine,
F-phenylalanine, P-proline, S-serine, T-threonine,
W-tryptophan, Y-tyrosine and V-valine.

Brief Description of Drawings

Figure 1 is a diagram showing the functioning of
6-deoxyerythronolide B synthase (DEBS), a modular PKS
producing 6-deoxyerythronolide B, a precursor of
erythromycin A.

Figure 2 gives the amino acid sequence comparison of
the AT domains of representative Type I PKS gene
clusters. The motifs GQG, GHS and LPTY are marked at the
base of the figure along with the arginine and the motif
defined in the invention as defining specificity. The
abbreviations used at the side to define the PKS used
are: ave: avermectin, debs: erythromycin, epo:
epothilone, sor: soraphen, fkb: FK506, rap: rapamycin,
tyl: tylosin, mon: monensin, nid: niddamycin, nys:
nystatin, rif: rifamycin. The numbers represent the
module number. The letter a at the end of the
designation indicates malonyl-CoA specific AT, the letter
p indicates methylmalonyl-CoA specific AT, and the letter
b indicates ethylmalonyl-CoA specific AT. Further types
of AT with unusual or ill-defined AT specificity are
indicated with letter x. Due to the numbers of sequences
considered, in the pileup each section of 50 amino acids
spreads over two pages. The sequences of the monensin
ATs are unpublished. They are set out in PCT/GB00/02072.

Figure 3 shows a three-dimensional representation of
the active site of the E. coli acyltransferase. The
spatial arrangement of the motifs described in the text
is shown by arrows and the atoms shown in bold.

Figure 4 shows the enzymatic steps that convert
6-deoxyerythronolide B into erythromycin A in
Saccharopolyspora erythraea.

Figure 5 shows the DNA sequence from the monensin
PKS encoding the loading AT used in Example 8.

Modes for Carrying Out the Invention

Example 1

Construction of plasmid pHP41

Plasmid pHP41 is a pCJR24-based plasmid containing
the DEBS1 PKS gene comprising a loading module, the first
and second extension modules of DEBS and the chain
terminating thioesterase. The motif YASH of the AT
domain of the first module has been altered to HAFH.
Plasmid pHP41 was constructed via several intermediate
plasmids as follows. Plasmid pD1AT2 (Oliynyk, M. et al.
Chem. Biol. (1996) 3:833-839) was digested with NdeI and
XbaI. A ~11 kbp fragment was isolated by gel
electrophoresis and the DNA purified from the gel. This
fragment was ligated into pCJR24 (Rowe, C.J. et al. Gene
(1998) 216:215-223) that had been linearised by digestion
with NdeI and XbaI and treated with alkaline phosphatase.
The ligation mixture was used to transform
electrocompetent E. coli DH10B cells and individual
clones checked for the desired plasmid pCJR26. Plasmid
pCJR26 was identified by restriction pattern. pCJR26 was
transformed into E. coli strain ET12567 (McNeil, D.J. et
al. Gene (1992) 111:61-68) and an individual colony grown
overnight to isolate demethylated DNA. This DNA was
linearised using MscI and AvrII and the ~13 kb fragment
(Fragment A) isolated by gel electrophoresis and
purification from the gel.

A DNA segment of the eryAI gene (start nucleotide
45368, end nucleotide 34734) from S. erythraea extending
from nucleotide 42104 to nucleotide 41542 was amplified
by PCR using the following oligonucleotide primers:
5'-TTTTTTTGGCCAGGGTTGGCAGTGGGCGGGCA-3' and
5'-TTTTTACGGCCAGCCGCTTGGCGCGGAT-3'. The DNA from a plasmid
designated pCJR65 derived from pCJR24 and DEBS1TE was
used as a template. The design of the primers introduced
an MscI site at nucleotide 42105 and the second primed
across a BstXI site at position 41546. The 574 bp PCR
product was treated with T4 polynucleotide kinase and
ligated to plasmid pUC18 that had been linearised by
digestion with SmaI and then treated with alkaline
phosphatase. The ligation mixture was used to transform
electrocompetent E. coli DH10B and individual clones
checked for the presence of the desired plasmid pHP39.
Plasmid pHP39 was identified by restriction pattern and
sequence analysis. Demethylated DNA was produced by
transforming E. coli strain ET12567 with plasmid DNA.
The resulting DNA was linearised by digestion with MscI
and BstXI and the resulting 552 bp fragment (Fragment B)
isolated by gel electrophoresis and purified from the
gel. A DNA segment of the eryAI gene from S. erythraea
extending from nucleotide 41557 to nucleotide 41120 was
amplified by PCR using the following oligonucleotide
primers: 5'-CGGTGCCTAGGTGCACCGACTCCCAGTCC-3' and
5'-TTTTTCCAAGCGGCTGGCCGTGGACCACGCGTTCCACTCCTCGCACGTCGAGACGAT-3'.
DNA from plasmid pCJR65 was used as a template.
The design of the primers introduced an AvrII site at
nucleotide 41125 and the second primed across a BstXI
site at nucleotide 41557 and mutated the amino acid
sequence YASH to HAFH (encoded by nucleotides
41537-41526). The 442 bp PCR product was treated with T4
polynucleotide kinase and ligated to plasmid pUC18 that
had been linearised by digestion with SmaI and then
treated with alkaline phosphatase. The ligation mixture
was used to transform electrocompetent E. coli DH10B and
individual clones checked for the presence of the
desired plasmid pHP40. Plasmid pHP40 was identified by
restriction pattern and sequence analysis. Plasmid pHP40
was linearised by digestion with restriction enzymes
AvrII and BstXI, and a 427 bp fragment (Fragment C)
isolated by gel electrophoresis and purified from the
gel. Fragments A, B and C were ligated together and the
resulting ligation mixture used to transform
electrocompetent E. coli DH10B. Individual clones were
checked for the presence of an insert derived from DEBS1.
The resulting plasmid was designated pHP41. Sequence
analysis was used to confirm that the clone contained the
correct motif HAFH.
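
As a quick consistency check on the mutagenic primer of
Example 1, the Python sketch below translates the primer in
each of the three forward reading frames using the standard
genetic code and reports the frame in which the intended HAFH
motif appears. The reading frame is not stated in the text, so
the code simply tries all three; the function names are
illustrative, and the check concerns only the primer as
written, not the sequence of the final construct.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third base varying fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate complete codons of dna starting at the given frame offset."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def frames_encoding(dna, motif):
    """Return the forward frames (0, 1, 2) whose translation contains motif."""
    return [f for f in range(3) if motif in translate(dna, f)]


# Mutagenic primer from Example 1 (YASH to HAFH).
primer = "TTTTTCCAAGCGGCTGGCCGTGGACCACGCGTTCCACTCCTCGCACGTCGAGACGAT"
for frame in frames_encoding(primer, "HAFH"):
    print(frame, translate(primer, frame))
```

The same check can be applied to the primers of the later
examples by substituting the primer sequence and the expected
motif.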

Example 2

Construction of S. erythraea NRRL2338 JC2/pHP41 and
production of triketides

S. erythraea NRRL2338 JC2 contains a deletion of
eryAI, eryAII and eryAIII apart from the TE (Rowe, C.J.
et al. Gene 216, 215-223). Plasmid pHP41 was used to
transform S. erythraea NRRL2338 JC2 protoplasts using the
TE as a homology region. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea NRRL2338 JC2 (pHP41) was
plated onto SM3 agar (see patent application WO 00/01827)
containing 40 µg/ml thiostrepton and allowed to grow for
11 days at 30°C. Approximately 1 cm² of the agar was
homogenised and extracted with a mixture of 1.2 ml ethyl
acetate and 20 µl formic acid. The solvent was decanted
and removed by evaporation and the residue dissolved in
methanol and analysed by GC/MS. The major products were
identified by comparison with authentic standards
(Oliynyk, M. et al. Chem. Biol. (1996) 3:833-839) as the
triketide lactones (2S,3R,5R)-2-methyl-3,5-dihydroxy-n-
hexanoic δ-lactone (AAP, i.e. Acetate, Acetate,
Propionate incorporation), (2S,3R,5R)-2-methyl-3,5-
dihydroxy-n-heptanoic δ-lactone (PAP), (2R,3S,4S,5R)-2,4-
dimethyl-3,5-dihydroxy-n-heptanoic δ-lactone (PPP) and
(2R,3S,4S,5R)-2,4-dimethyl-3,5-dihydroxy-n-hexanoic
δ-lactone (APP). These products were identified as their
ammonium adducts corresponding to exact mass 144, 158,
172 and 158. Four products were produced because in this
strain, and under the conditions of the experiment, the
loading module loads both acetate and propionate and the
modified AT loads malonyl-CoA and methylmalonyl-CoA.
Only three triketide lactone peaks could be observed in
the GC/MS spectra under standard conditions; this was due
to the co-elution of the equivalent mass APP and PAP
compounds. An isocratic gradient was used to verify that
this peak comprised two components. In further sets of
experiments S. erythraea JC2 (pHP41) was used to
inoculate 5 ml TSB containing 5 µg/ml thiostrepton. After
three days growth 1.5 ml of this culture was used to
inoculate 25 ml SM3 media containing 5 µg/ml thiostrepton
in a 250 ml flask. The flask was incubated at 30°C,
250 rpm for 6 days. At this time the supernatant was
adjusted to pH 3.0 with formic acid and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by GC/MS.
In each experiment we could identify the four products
AAP, PAP, PPP and APP but the absolute ratios and
quantities were variable, presumably depending on exact
media and growth conditions within each flask (Figure 6).
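
The four products and their masses follow directly from the
two choices described in Example 2. The Python sketch below,
offered for illustration only, enumerates the combinations of
starter unit and module 1 extender unit, derives a molecular
formula for each triketide lactone on the assumption that the
all-acetate lactone is C6H10O3 and that each propionate-derived
unit adds one CH2, and prints the rounded monoisotopic mass.
The function names are hypothetical.

```python
from itertools import product

# Monoisotopic atomic masses in unified atomic mass units.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def lactone_formula(units):
    """Formula of the triketide delta-lactone built from three units.

    Each unit is 'A' (acetate-derived) or 'P' (propionate-derived); the
    all-acetate lactone is taken as C6H10O3 and each 'P' adds one CH2.
    """
    n_p = units.count("P")
    return {"C": 6 + n_p, "H": 10 + 2 * n_p, "O": 3}


def monoisotopic_mass(formula):
    return sum(MASS[element] * count for element, count in formula.items())


# Starter: acetate (A) or propionate (P). Module 1 extender: malonyl-CoA (A)
# or methylmalonyl-CoA (P). Module 2 extender: methylmalonyl-CoA (P) only.
for starter, extender1 in product("AP", repeat=2):
    name = starter + extender1 + "P"
    print(name, round(monoisotopic_mass(lactone_formula(name))))
```

Because APP and PAP share one formula, they have the same
mass, which is consistent with the two compounds appearing as a
single co-eluting peak under the standard conditions.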

Example 3

Construction of S. erythraea NRRL2338 (pHP41) and
its use to produce 12-desmethyl erythromycin B.

Plasmid pHP41 was used to transform S. erythraea
NRRL2338 protoplasts. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. Several clones were tested for the
presence of pHP41 integrated into the chromosome by
Southern blot hybridisation of their genomic DNA with DIG
labelled vector DNA. A clone with a correctly integrated
copy of pHP41 was identified in this way. S. erythraea
NRRL2338 (pHP41) was used to inoculate 5 ml TSB containing
5 µg/ml thiostrepton. After three days growth 1.5 ml of
this culture was used to inoculate 25 ml EryP media (see
patent application WO 00/00500) containing 5 µg/ml
thiostrepton in a 250 ml flask. The flask was incubated
at 30°C, 250 rpm for 6 days. At this time the supernatant
was adjusted to pH 9.0 with ammonia and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by
HPLC/MS. A peak with m/z (M+H) = 704 was observed, as
required for C-12 desmethyl erythromycin B, in
addition to a peak corresponding to erythromycin A
(M+H) = 734. Other peaks corresponding to partially
processed erythromycin intermediates could be identified.



Example 4

Construction of plasmid pHP048

Plasmid pHP048 is a pCJR24-based plasmid containing the
DEBS1 PKS gene comprising a loading module, the first and
second extension modules of DEBS1 and the chain
terminating thioesterase. The motif YASH of the AT
domain of the first module has been altered to HASH.
Plasmid pHP048 was constructed via several intermediate
plasmids as follows.

A DNA segment of the eryAI gene from S. erythraea
extending from nucleotide 41557 to nucleotide 41120 was
amplified by PCR using the following oligonucleotide
primers: 5'-CGGTGCCTAGGTGCACCGACTCCCAGTCC-3' and
5'-TTTTTCCAAGCGGCTGGCCGTGGACCACGCGTCGCACTCCTCGCACGTCGAGACGAT-3'.
The DNA from plasmid pCJR65 was used as a template.
The design of the primers introduced an AvrII site at
nucleotide 41125 and the second extended to a BstXI site
at nucleotide 41557 and also mutated the amino acid
sequence YASH (encoded by nucleotides 41537-41526) to
HASH. The 442 bp PCR product was treated with T4
polynucleotide kinase and ligated to plasmid pUC18 that
had been linearised by digestion with SmaI and then
treated with alkaline phosphatase. The ligation mixture
was used to transform electrocompetent E. coli DH10B and
individual clones checked for the presence of the desired
plasmid pHP022. Plasmid pHP022 was identified by
restriction pattern and sequence analysis. Plasmid pHP022
was linearised by digestion with restriction enzymes
AvrII and BstXI, and the fragment (Fragment D) isolated
by gel electrophoresis and purified from the gel.
Fragment D was ligated with Fragments A and B described
previously and the resulting ligation mixture used to
transform electrocompetent E. coli DH10B. Individual
clones were checked for the presence of an insert derived
from DEBS1. The resulting plasmid was designated pHP048.
Sequence analysis was used to confirm that the clone
contained the correct motif HASH.

Example 5

Construction of S. erythraea NRRL2338 JC2 (pHP048)
and its use to produce triketides

S. erythraea NRRL2338 JC2 contains a deletion of
eryAI, eryAII and eryAIII apart from the TE (Rowe, C.J.
et al. Gene 216, 215-223). Plasmid pHP048 was used to
transform S. erythraea NRRL2338 JC2 protoplasts using the
TE as a homology region. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea JC2 (pHP048) was used to
inoculate 5 ml TSB containing 5 µg/ml thiostrepton. After
three days growth 1.5 ml of this culture was used to
inoculate 25 ml SM3 media containing 5 µg/ml thiostrepton
in a 250 ml flask. The flask was incubated at 30°C,
250 rpm for 6 days. At this time the supernatant was
adjusted to pH 3.0 with formic acid and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by GC/MS.
A mixture of products was identified as their ammonium
adducts corresponding to the AAP, PAP, APP and PPP
triketide lactones as described in Example 2. In this
example, under the media/growth conditions described, the
PKS with the HASH change is more catalytically active
than that with the HAFH change (Example 2) as judged by
total amounts of triketide lactone produced; however in
this case the modified PKS appears to display lower
selectivity towards acetate as judged by the ratio of AAP
to PPP triketide lactone.



Example 6 

Construction of plasmid pHP47 

Plasmid pHP47 is a pCJR24-based plasmid containing
the DEBS1 PKS gene comprising a loading module, the first
and second extension modules of DEBS1 and the chain
terminating thioesterase. The motif YASH of the AT
domain of the first module has been altered to VAGH. Plasmid
pHP47 was constructed via several intermediate plasmids as
follows.

A DNA segment of the eryAI gene from S. erythraea
extending from nucleotide 41557 to nucleotide 41120 was
amplified by PCR using the following oligonucleotide
primers: 5'-CGGTGCCTAGGTGCACCGACTCCCAGTCC-3' and
5'-TTTTTCCAAGCGGCTGGCCGTGGACGTCGCGGGGCACTCCTCGCACGTCGAGACGAT-3'.
The DNA from plasmid pCJR65 was used as a template.
The first primer introduced an AvrII site at
nucleotide 41125; the second extended to a BstXI site
at nucleotide 41557 and also mutated the amino acid sequence
YASH (encoded by nucleotides 41537-41526) to VAGH. The
442bp PCR product was treated with T4 polynucleotide
kinase and ligated to plasmid pUC18 that had been
linearised by digestion with SmaI and then treated with
alkaline phosphatase. The ligation mixture was used to
transform electrocompetent E. coli DH10B and individual
clones were checked for the presence of the desired plasmid
pHP46. Plasmid pHP46 was identified by restriction
pattern and sequence analysis. Plasmid pHP46 was
linearised by digestion with restriction enzymes AvrII
and BstXI, and the fragment (Fragment E) isolated by gel
electrophoresis and purified from the gel. Fragment E
was ligated with Fragments A and B described previously
and the resulting ligation mixture used to transform
electrocompetent E. coli DH10B. Individual clones were
checked for the presence of an insert derived from DEBS1.
The resulting plasmid was designated pHP47. Sequence
analysis was used to confirm that the clone contained the
correct motif VAGH.
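
The motif change carried by the second primer of Example 6 can be checked by translating the primer as printed. The sketch below is illustrative only: it assumes the standard genetic code, assumes that the coding frame starts at the second base of the primer (found by inspection, not stated in the text), and assumes that the printed sequence is free of transcription errors. The wild-type comparison string is read from the DEBS module 1 row of the AT alignment reproduced after the claims.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna from a 0-based offset, ignoring any trailing partial codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Mutagenic primer exactly as printed in Example 6.
PRIMER = "TTTTTCCAAGCGGCTGGCCGTGGACGTCGCGGGGCACTCCTCGCACGTCGAGACGAT"

# Assumption: frame 1 (second base) is the coding frame.
peptide = translate(PRIMER, frame=1)

# Wild-type residues around the motif, read from the DEBS module 1 alignment row.
WILD_TYPE = "AVDYASHSSHVET"
mutant = peptide[-len(WILD_TYPE):]

# Positions (0-based, within the 13-residue window) where the primer changes a residue.
changes = [(i, w, m) for i, (w, m) in enumerate(zip(WILD_TYPE, mutant)) if w != m]

print(peptide)   # FSKRLAVDVAGHSSHVET
print(changes)   # [(3, 'Y', 'V'), (5, 'S', 'G')]
```

Under these assumptions the primer encodes VAGH in place of YASH and leaves the flanking residues unchanged, which is consistent with the two-residue alteration described for pHP47.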


Example 7 

Construction of plasmid pLS007

Plasmid pLS007 contains the crotonyl-CoA reductase
(CCR) gene from S. cinnamonensis that is believed to
influence the level of ethylmalonyl-CoA within the cell.
Plasmid pSG142 (Gaisser et al. Mol. Microbiol. (2000) 36,
391-401) places genes under the control of the actI
promoter and can be used to integrate either in the right
hand side of the erythromycin gene cluster or in the act
promoter region of a previously transformed actinomycete.
Two oligonucleotide primers,
5'-GGCAAACATATGAAGGAAATCCTGGACGCG-3' and
5'-TCCGCGGATCCTCAGTGCGTTCAGATCAGTGC-3', were used to amplify
the S. cinnamonensis CCR gene using genomic DNA as
template. The design of the primers incorporated NdeI
and BamHI restriction sites to facilitate cloning. The
1.4kb PCR product was isolated by gel electrophoresis,
purified from the gel and ligated with pSG142 that had
been digested with NdeI and BglII. The resulting




ligation mixture was used to transform electrocompetent
E. coli DH10B cells. Plasmid pLS003 was identified by
restriction analysis and sequenced to ensure that errors had
not been introduced during amplification. A discrepancy with
the published sequence was identified. However, further
analysis by comparison with other published CCR protein
sequences indicated that pLS003 was correct. Plasmid pLS003
was digested with NdeI and XbaI and the resulting 4.5kb
fragment (Fragment F) isolated by gel electrophoresis and
purified from the gel. This fragment was ligated to
pLSB2, a derivative of pKC1132 containing the actI/actII
promoter region behind an NdeI site. Plasmid pLSB2 was
digested with NdeI and XbaI and the resulting ~4kb
fragment (Fragment G) isolated by gel electrophoresis and
purified from the gel. Fragments F and G were ligated
together and the resulting ligation mixture was used to
transform electrocompetent E. coli DH10B cells. Plasmid
pLS007 was identified by restriction analysis.
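
Example 7 ligates a PCR product carrying NdeI and BamHI sites into pSG142 cut with NdeI and BglII. This works because BamHI and BglII leave the same 5'-GATC overhang. The sketch below is a minimal illustration, not part of the original disclosure: it locates the two sites in the CCR primers as printed and compares the overhangs. The recognition sequences and cut positions are the generally published ones for these enzymes, and the helper names are hypothetical.

```python
# Recognition sequence and top-strand cut offset for the enzymes used in Example 7.
ENZYMES = {
    "NdeI": ("CATATG", 2),   # CA^TATG
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def overhang(enzyme):
    """Single-stranded 5' overhang left by a palindromic site cut symmetrically."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def site_position(enzyme, dna):
    """0-based index of the first recognition site in dna, or -1 if absent."""
    return dna.upper().find(ENZYMES[enzyme][0])


# CCR primers exactly as printed in Example 7.
FORWARD = "GGCAAACATATGAAGGAAATCCTGGACGCG"
REVERSE = "TCCGCGGATCCTCAGTGCGTTCAGATCAGTGC"

print(site_position("NdeI", FORWARD))          # 6
print(site_position("BamHI", REVERSE))         # 5
print(overhang("BamHI"), overhang("BglII"))    # GATC GATC
```

The BamHI/BglII junction formed in this way is no longer recognised by either enzyme, which is worth bearing in mind when planning later restriction analysis of the construct.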

Example 8

Construction of S. erythraea NRRL2338 JC2
(pHP47/pLS007) and its use to produce triketides

S. erythraea NRRL2338 JC2 contains a deletion of
eryAI, eryAII and eryAIII apart from the TE (Rowe, C.J.
et al. Gene 216, 215-223). Plasmid pHP47 was used to
transform S. erythraea NRRL2338 JC2 protoplasts using the
TE as a homology region. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. Plasmid pLS007 was used to transform protoplasts
of S. erythraea NRRL2338 JC2 (pHP47); thiostrepton and
apramycin resistant clones were selected on R2T20 agar
containing 40 µg/ml thiostrepton and 50 µg/ml apramycin
plus 10mM magnesium chloride, and the resistance markers
verified by plating on tapwater media containing the same
antibiotics. S. erythraea NRRL2338 JC2 (pHP47/pLS007) was
used to inoculate 5ml TSB containing 5 µg/ml thiostrepton
and 50 µg/ml apramycin. After three days' growth 1.5ml of
this culture was used to inoculate 25ml SM3 media
containing 5 µg/ml thiostrepton and 50 µg/ml apramycin in
a 250ml flask. The flask was incubated at 30°C, 250rpm
for 6 days. At this time the supernatant was adjusted to
pH3.0 with formic acid and extracted twice with an equal
volume of ethyl acetate. The solvent was removed by
evaporation and the residue analysed by GC/MS. In this
experiment amounts of triketide product were lower, but a
mixture of products could be identified as their ammonium
adducts corresponding to exact masses 158, 172 and 186.
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Example 9 

Construction of S. erythraea NRRL2338 (pHP47) and
its use to produce erythromycins

Plasmid pHP47 was used to transform S. erythraea
NRRL2338 protoplasts. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea NRRL2338 (pHP47) was used to
inoculate 5ml TSB containing 5 µg/ml thiostrepton. After
three days' growth 1.5ml of this culture was used to
inoculate 25ml EryP media containing 5 µg/ml thiostrepton
in a 250ml flask. The flask was incubated at 30°C, 250rpm
for 6 days. At this time the supernatant was adjusted to
pH9.0 with ammonia and extracted twice with an equal
volume of ethyl acetate. The solvent was removed by
evaporation and the residue analysed by HPLC/MS. Peaks
of mass m/z (M+H)=734 corresponding to erythromycin A
were observed.
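
The masses quoted in Examples 8, 9 and 12 can be related to molecular formulae with a short monoisotopic-mass calculation. The sketch below is illustrative and rests on stated assumptions: C8H14O3 and C9H16O3 are the formulae of the hexanoic and heptanoic triketide lactones named in Example 11, C10H18O3 is assumed to be the next homologue (one CH2 heavier) for the mass 186 species, and C37H67NO13 and C36H65NO13 are the formulae of erythromycin A and of its analogue with one fewer CH2. The function names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def mono_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C9H16O3' or 'NH4'."""
    pairs = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return sum(MASS[element] * int(count or 1) for element, count in pairs)


def adduct_mz(formula, adduct):
    """m/z of the singly charged cation formed by adding `adduct` ('H' or 'NH4')."""
    return mono_mass(formula) + mono_mass(adduct) - ELECTRON


TRIKETIDE_LACTONES = ["C8H14O3", "C9H16O3", "C10H18O3"]  # last entry is an assumption
ERYTHROMYCINS = {"erythromycin A": "C37H67NO13",
                 "13-methyl erythromycin A": "C36H65NO13"}

for formula in TRIKETIDE_LACTONES:
    # Neutral mass and the ammonium adduct seen by GC/MS.
    print(formula, round(mono_mass(formula), 4), round(adduct_mz(formula, "NH4"), 4))

for name, formula in ERYTHROMYCINS.items():
    # Protonated ion seen by HPLC/MS.
    print(name, round(adduct_mz(formula, "H"), 4))
```

With these formulae the neutral lactone masses round to 158, 172 and 186, and the protonated erythromycins round to 734 and 720, matching the nominal values reported in the Examples.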

Example 10 

Construction of plasmid pSGK051

Plasmid pSGK051 is a pPFL43-based plasmid (WO
00/00500). The motif HAFH of the AT domain of the
loading domain has been altered to YASH. Plasmid pSGK051
was constructed via several intermediate plasmids as
follows.

Plasmid pPFL43 was linearised by digestion with
restriction enzymes NcoI and NotI and an 858bp fragment
(Fragment Q) isolated by gel electrophoresis and purified
from the gel.

A DNA segment of the monensin loading domain from
nucleotide 16360 to 17366 (see figure 5 and PCT/GB00/02072)
was amplified by PCR using the following oligonucleotide
primers:
5'-GGGGACGCGGCCGCAAGGCCCACCACCTGAAGGTCAGCTACGCCTCCCACTCCCCGCACATGGACCCCAT-3'
and 5'-GGCTAGCGGGTCCTCGTCCGTGCCGAGGTCA-3'. The first primer
extended across a NotI site at nucleotide 16367 and changed
the amino acid sequence HAFH to YASH at nucleotides
16398-16409; the second introduced an NheI site equivalent
to that in pPFL43. The DNA from plasmid pPFL43 was used as a
template. The 1006bp PCR product was treated with T4
polynucleotide kinase and ligated to plasmid pUC18 that
had been linearised by digestion with SmaI and treated
with alkaline phosphatase. The ligation mixture was used
to transform electrocompetent E. coli DH10B and
individual clones were checked for the presence of the desired
plasmid pGSAT9. Plasmid pGSAT9 was identified by
restriction pattern and sequence analysis. Plasmid
pGSAT9 was linearised by digestion with restriction
enzymes NotI and NheI and a 995bp fragment (Fragment R)
isolated by gel electrophoresis and purified from the
gel. Plasmid pPFL43 was digested with NcoI and NheI to
remove a 1.8kb fragment and the larger fragment (Fragment
S) isolated by gel electrophoresis and purified from the
gel. Fragments Q, R and S were ligated together and the
resulting ligation mixture used to transform
electrocompetent E. coli DH10B. Individual clones were
checked for the desired plasmid pSGK051. The resulting
plasmid was analysed by restriction digest and sequenced
to confirm the presence of the correct motif YASH.
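
The nucleotide coordinates given for the YASH codons in Example 10 can be cross-checked against the mutagenic primer. The sketch below is illustrative only. It assumes that the first base of the primer corresponds to nucleotide 16360 (the start of the amplified segment) and that the printed primer sequence is accurate; the degenerate codon patterns and the function name are introduced here for illustration.

```python
import re

# Mutagenic primer exactly as printed in Example 10.
PRIMER = ("GGGGACGCGGCCGCAAGGCCCACCACCTGAAGGTCAGCTACGCCTCCCAC"
          "TCCCCGCACATGGACCCCAT")

# Assumption: the 5' base of the primer lies at nucleotide 16360.
PRIMER_START = 16360

# Degenerate codon patterns for the residues occurring in the motif.
CODON_PATTERNS = {
    "Y": "TA[TC]",
    "A": "GC[ACGT]",
    "S": "(?:TC[ACGT]|AG[TC])",
    "H": "CA[TC]",
}


def motif_coordinates(motif, primer=PRIMER, start=PRIMER_START):
    """Return (first, last) template coordinates of the codons encoding `motif`."""
    pattern = "".join(CODON_PATTERNS[residue] for residue in motif)
    match = re.search(pattern, primer)
    if match is None:
        return None
    return start + match.start(), start + match.end() - 1


print(motif_coordinates("YASH"))   # (16398, 16409)
```

Under the stated assumption the YASH codons fall at nucleotides 16398-16409, in agreement with the coordinates quoted in the text.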

Example 11 

Construction of S. erythraea NRRL2338 JC2/pSGK051
and production of triketides

Plasmid pSGK051 was used to transform S. erythraea
NRRL2338 JC2 protoplasts using the TE as a homology
region. Thiostrepton resistant colonies were selected on
R2T20 agar containing 40 µg/ml thiostrepton. S.
erythraea NRRL2338 JC2 (pSGK051) was plated onto R2T20
agar containing 40 µg/ml thiostrepton and allowed to grow
for 11 days at 30°C. Approximately 1cm² of the agar was
homogenised and extracted with a mixture of 1.2ml ethyl
acetate and 20 µl formic acid. The solvent was decanted
and removed by evaporation and the residue dissolved in
methanol and analysed by GC/MS. The major products were
identified by comparison with authentic standards as the
triketide lactones
(2S,3R,4S,5R)-2,4-dimethyl-3,5-dihydroxy-n-heptanoic δ-lactone
and
(2S,3R,4S,5R)-2,4-dimethyl-3,5-dihydroxy-n-hexanoic δ-lactone.

Example 12 

Construction of S. erythraea NRRL2338 (pSGK051) and
its use to produce erythromycins

Plasmid pSGK051 was used to transform S. erythraea
NRRL2338 protoplasts. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea NRRL2338 (pSGK051) was plated
onto R2T20 agar containing 40 µg/ml thiostrepton and
allowed to grow for 10 days at 30°C. Approximately 2cm²
of the agar was homogenised and extracted with a mixture
of 1.2ml ethyl acetate and 20 µl dilute ammonia. The
solvent was decanted and removed by evaporation and the
residue analysed by HPLC/MS. Peaks of mass m/z (M+H)=734
and 720 could be observed alongside likely products of
incomplete processing. Comparison with authentic standards
confirmed that the compounds produced were erythromycin A
and 13-methyl erythromycin A.

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CLAIMS : 

1. A method of synthesising a compound whereof at
least a portion is the product of a polyketide synthase
(PKS) enzyme complex or is derived from such a product,
said PKS enzyme complex including at least one
acyltransferase (AT) domain; said method comprising the
steps of (i) providing said PKS enzyme complex in which
said AT domain has been altered to change selectively a
minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of one or more
motifs which are present in the active site pocket of the
AT domain and which influence the substrate specificity
of the AT domain, the alteration affecting the substrate
specificity; and (ii) effecting synthesis by means of
said PKS enzyme complex to produce a compound or mixture
of compounds different from what could have been produced
by means of a PKS enzyme in which said AT domain had not
been altered.

2. A method according to claim 1 wherein said
motif comprises a four-residue sequence corresponding to
the YASH motif of the AT domain of the first module of
DEBS.

3. A method of synthesising a compound whereof at
least a portion is the product of a polyketide synthase
(PKS) enzyme complex or is derived from such a product,
said PKS enzyme complex including at least one
acyltransferase (AT) domain; said method comprising the
steps of (i) providing said PKS enzyme complex in which
said AT domain has been altered to change selectively a
minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of a motif
which influences the substrate specificity of the AT
domain and which comprises a four-residue sequence
corresponding to the YASH motif of the AT domain of the
first module of DEBS, the alteration affecting the
substrate specificity; and (ii) effecting synthesis by
means of said PKS enzyme complex to produce a compound or
mixture of compounds different from what could have been
produced by means of a PKS enzyme in which said AT domain
had not been altered.

4. A method according to claims 1, 2 or 3 wherein
said motif was located by a) determining the sequence of
the AT domain and b) performing sequence alignment with a
plurality of sequences of other AT domains.

5. A method according to any preceding claim
wherein the PKS enzyme complex is at least part of a
modular type I PKS enzyme complex.

6. A method according to any preceding claim
wherein said alteration of the AT domain affects less
than 5% of the residues.

7. A method according to any preceding claim
wherein said alteration alters a motif selected from
XAFH, XASH, and XAGH and/or creates such a motif.

8. A method according to claim 7 wherein the motif
is XAGH and X is selected from F, V and H.

9. A method according to claim 7 wherein the motif
is XAFH and X is H.

10. A method according to claim 7 wherein the motif
is XASH and X is selected from Y, H, W and V.

11. A method according to any of claims 1-10
wherein said alteration produces or alters a motif
containing proline.

12. A method according to any preceding claim
wherein in addition to the alteration to one or more
residues of said motif(s), one or more additional
residues in or adjacent the substrate binding pocket have
been altered.

13. A method according to claim 12 wherein said
additional altered residue(s) comprise one or more of a)
the residue immediately downstream of the motif, b) the
residue three residues downstream from the GQG motif, c)
the residue immediately downstream of the GHS motif, and
d) the residue four residues downstream of the conserved
arginine residue.

14. A method according to any preceding claim
wherein the alteration produces a motif specific for
malonyl-CoA and the motif is followed by S which was
produced by alteration if not already present.

15. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA and the motif is followed by S, G, C or
T which was produced by alteration if not already
present.

16. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA, and the residue following the GHS
motif in the active site is Q which was produced by
alteration if not already present.

17. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
malonyl-CoA, and the residue following the GHS motif in
the active site is V, I or L which was produced by
alteration if not already present.

18. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA, and the residue 3 residues downstream
of the GQG motif is W which was produced by alteration if
not already present.

19. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
malonyl-CoA, and the residue 3 residues downstream of the
GQG motif is R, H or T which was produced by alteration
if not already present.

20. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
malonyl-CoA and the residue 4 residues downstream of the
conserved R as found as residue 252 in the first module
of DEBS is M which was produced by alteration if not
already present.

21. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA and the residue 4 residues downstream
of the conserved R as found as residue 252 in the first
module of DEBS is I or L which was produced by alteration
if not already present.

22. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
ethylmalonyl-CoA and the residue 4 residues downstream of
the conserved R as found as residue 252 in the first
module of DEBS is W which was produced by alteration if
not already present.

23. A method according to any preceding claim
wherein the AT domain has an active site with a GHS
motif, and said motif which is altered starts 80-110
residues downstream of said GHS motif.

24. A method according to any preceding claim
wherein said step (i) of providing said PKS enzyme
complex comprises providing a nucleic acid sequence
encoding said complex and effecting expression thereof.

25. A method according to claim 24 wherein
expression is effected in an organism capable of
producing polyketides.

26. A method according to claim 24 or claim 25
wherein said nucleic acid sequence has been subjected to
site directed mutagenesis so that it encodes said altered
AT domain.

27. A method according to claim 24, 25 or 26
wherein the AT domain prior to alteration is naturally
expressed in a first organism and the altered AT is
expressed in a second organism which is better able than
the first organism to supply a substrate for which the
alteration has increased specificity and/or which is less
well able than the first organism to supply a substrate
for which the alteration has reduced specificity.

28. A method according to any preceding claim
wherein said PKS includes said AT domain and a second
domain which is naturally coupled thereto prior to the
alteration thereof to receive a substrate transferred to
it by the AT; and the alteration causes the AT to act to
transfer a different substrate to the second domain.

29. A method according to any preceding claim
wherein said PKS includes said AT domain and its natural
cognate ACP domain which, prior to the alteration, is
adapted to receive a substrate transferred to it by the
AT; and the alteration causes the AT to act to transfer a
different substrate to said cognate ACP domain.

30. A method according to any preceding claim
wherein said PKS including the altered AT domain is
spliced to a hybrid PKS.

31. A polyketide compound or derivative thereof or
compound whereof a portion is a polyketide or derivative
thereof, which compound is obtainable by a method
according to any preceding claim wherein the compound
differs from a compound resulting from synthesis effected
by means of said PKS enzyme complex without the
alteration of said AT domain.

32. Nucleic acid encoding a PKS enzyme complex
including an altered AT domain as defined in any of
claims 1-30.

33. A vector including a nucleic acid according to
claim 32.

34. A host organism containing nucleic acid
according to claim 32 and able to express the PKS enzyme
complex.

35. A host organism according to claim 34 which is
adapted to synthesise a compound whereof at least a
portion is a polyketide resulting from the action of the
PKS enzyme complex.

36. A method of synthesising a polyketide synthase
(PKS) enzyme complex, said PKS enzyme complex including
at least one acyltransferase (AT) domain; said method
comprising altering said AT domain to change selectively
a minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of one or more
motifs which are present in the active site pocket of the
AT domain and which influence the substrate specificity
of the AT domain, the alteration affecting the substrate
specificity.

37. A method according to claim 36 wherein said
motif comprises a four-residue sequence corresponding to
the YASH motif of the AT domain of the first module of
DEBS.

38. A method of synthesising a polyketide synthase
(PKS) enzyme complex, said PKS enzyme complex including
at least one acyltransferase (AT) domain; said method
comprising altering said AT domain to change selectively
a minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of a motif
which influences the substrate specificity of the AT
domain and which comprises a four-residue sequence
corresponding to the YASH motif of the AT domain of the
first module of DEBS, the alteration affecting the
substrate specificity.

39. A PKS enzyme complex as produced by the method
of claims 36, 37 or 38.

[Figure: multiple sequence alignment of AT domains, with rows
labelled by the prefixes atave, atdebs, atepo, atsor, atfkb,
atrap, atnid, atmon, attyl, atnys and atrif. The alignment
blocks covering the GQG and GHS active-site regions were
extracted as detached columns without their row labels and are
not reproduced here; the labelled block containing the
four-residue specificity motif follows.]

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ataveOOx RSTWSGARK AVADLVADLT AAQVRTRMIP -VDVPAHSPL MYAIEERW. Load AT 
atdebsOOp RSVLLTGSPE PVARRVQELS AEGVRAQVIN .VSMAAHSAQ VDDIAEGMR. Load AT 
atepoOSp RSTVLAGEPA 7\LSEVLAALT AKGVFWRQV. KVDVASHSPQ VDPLREEL.I 
atepo07p RSTVLAGEPA ALSEVLAALT AKGVFWRQV. KVDVASHSPQ VDPLREEL.I 
atepoOlp RSTVLSGEPA AIGEVLSSLN AKGVFCRRV. KVDVASHSPQ VDPLREDL.L 
atepoOSp RSTVLAGEPA ALAEVLAILA AKGVFCRRV. KVDVASHSPQ IDPLRDEL.L 
atsoralx DSTVLAGEPD MDALLQALE RKNVFCRRV. AMDVAPHCPQ VDCLRDEL.F Benzoate-CoA 
atfkbOlp ESTWAGDPA AVERVLARYE AEGVRVRRI . AVDYASHTPH VEAIEAQL.A 
atfkbOSp ESTWAGDPS AVERVLARYE AEGVRVRRI. AVDYASHTPH VEAIQEQL.A 
atrap03p ASTVIAGTPE AVDHVLTAHE ARGVRVRRI . TVDYASHTPH VELIRDEL.L 
atrap06p ASTVIAGTPE AVDHVLTAHE ARGVRVR^I. TVDYASHTPH VELIRDEL.L 
atrap04p ASTVIAGTPE AVDHVLTAHE ARGVRVRRI. TVDYASHTPH VELIRDEL.L 
atrapl3p ASTVIAGTPE AVDHVLTAHE AQGVRVRRI- TVDYASHTPH VELIRDEL.L 
atrapOlp ASTWAGAPE AVDRVLAVHE ARGVRVRRI. AVDYASHTPH VELIRDEL.L 
atrap07p ASTWAGAPE AVDRVLAVHE ARGVRVRRI- AVDYASHTPH VELIRDEL.L 
atraplOp ASTVIAGTPE AVDHVLTALR QRGAGAAD. - HVDYASHTPH VELIRDEL.L 
atfkb04x ATTIVSGRPD AVETLIADYE ARGVWVTRL. WDCPTHTPF VDPLYDEL.Q C5 unit 
attyl04p ASTWSGDRR AVAGYVAVCQ AEGVQARLIP .VDYASHSRH VEDLKGELE. 
attyl06p ASTWSGDRR AVAGYVAVCQ AEGVQARLIP .VDYASHSRH VEDLKGELE. 
attylOlp ASTWSGDRR AVAGYVAVCQ AEGVQARLIP .VDYASHSRH VEDLKGELE. 
attyl02p ASTWSGDRR AVAGYVAVCQ AEGVQARLIP .VDYASHSRH VEDLKGELE. 
attylOOp ASTWSGDRR AVAGYVAVCQ AEGVQARLIP .VDYASHSRH VEDLKGELE. 
atnidOSb GSCAVAGDPE ALAELVALLT GEGVHARPIP GVDTAGHSPQ VDALRAHL.L Etmalonyl-CoA
attylOSb GTAAVAGDVD ALRELLAELT AEGIRAKPIP GVDTAGHSAQ VDGLKEHL.F Etmalonyl-CoA
atnid06x ASVTVSGDAL ALEEFGARLS AEGVLRWPLP GVDFAGHSPQ VEBFRAEL.L MeOmalonyl-CoA
atdebsOlp RSVWAGDSD ELDRLVASCT TECIRAKRL. AVDYASHSSH VETIRDALHA 
atmon02p SSTVISGPPE HVAAWADAE ARGLRARVID .VGYASHGPQ IDQLHDLL.T 
atmonlOp SSTVISGPPE HVAAWADAE ARGLRARVID .VGYASHGPQ IDQLHDLL.T 
atmon04p SSTVISGPPE HVAAWAEAE ARGLRARVID .VGYASHGPQ IDQLHDLL.T 
atmon07p SSTVISGPPE HVAAWADAE ERGLRARVID .VGYASHGPQ IDQLHDLL.T 
atmonllp SSTVISGPPE HVAAWADAE AQGLRARVID .VRYASHGPQ IDQLHDLL.T 
atmonl2p SSTVISGPPE HVAAWADAE ARGLRARVID .VGYASHGPQ IDQLHDLL.T 
atmonOSb SSTVISGPPE GIAAWADAQ ERGLRARAVA .SDVAGHGPQ LDAILDQL.T Bt/mal-CoA 
atmonOlp SSTVISGPPE QVAAWADAE ARELRGRVID .VDYASHSPQ VDAITDEL.T 
atdebs02p DAVWAGDAQ AAREFLEYCE GVGIRARAIP .VDYASHTAH VEPVRDEL.V 
atdebs06p SSVWSGDPE ALAELVARCE DBGVRAKTLP .VDYASHSRH VEEIRETI.L 
ataveOlp RSTAVSGDAE AVDEVLAYCA GTGVRARRIP .VDYASHCPH VQPLREEL.L 
atave07p RSTAVSGDAE AVDEVLAYCA GTGVRARRIP .VDYASHCPH VQPLREEL.L 
atave06p HSTTVSGDTK AVDEVLAHCT DTGLRAKRIP .VDYASHCPH VQPLHDEL.L 
atave09p HSTTVSGDTT AVEELLTHCA DTGLRAKRIP .VDYASHCPH VQPLHDEL.L 
atnysOlp RSVWAGEPE ALDALHARLT ADDIRARRIA .VDYASHSHQ VEDLHEEL.L 
atnysllp RSVWAGEPE ALDALHARLT ADDIRARRIA .VDYASHSHQ VEDLHEEL.L 

atrifOSp ASWIAGDAE ALTEAVEVLG G RRVA .VDYASHTRH VEDIQDTL.A 

atrif07p ASWIAGDAQ ALDEALEVLA GDGVRVRQVA .VDYASHTRH VEDIRDTL.A 
atrifOBp SSWIAGDAE ALDQALEALT GQDIRVRRVA .VDYASHTRH VEDIQEPL.A 
atriflOp ASWIAGDAQ ALDETLEALS GAGIRARRVA .VDYASHTRH VEDIEDTL.A 
atrif03p SSWIAGDAQ ALDEALEALA GDGVRVRRVA .VDYASHTRH VEAIAETL.A 
atrif06p ASWIAGEAQ ALDEWDALS GQEVRVRRVA .VDYGSHTNQ VEAIBDLL.A 
atrif04p TSWIAGDAE ALDEALDALD DQGVRIRRVA .VDYASHTRH VEAARDAL.A 
atrifOlp SSWIAGDAH ALDATLEILS GEGIRVRRVA .VDYASHTRH VEDIRDTL.A 
atnys02p SSVWSGDTD ALDALHTACQ EQGVRARKVS .VDYASHGRH VEAVRDEL.A 

atfkb02p ASIWAGAAD AVEELLAATP HARRIA .VDYASHTAH VESIRGAL.L 

atavellp RSAWSGEPE AVD7VLVEELS HEDVPARRLM .VDWASHSPQ VEAIQGRL.L 
atdebs03p RSVWSGEPG ALRAFSEDCA AEGIRVRDID .VDYASHSPQ lERVREEL.L 
atnid04p ETTWCGAPG AVDSLLGVLQ GEGVRVRRID .VDYASHSRH VEGIRDEL.A 
atdebsOSp RSVWAGESG PLDELIAECE AEGITARRIP .VDYASHSPQ VESLREEL.L 
atdebs04p GTSWAGPTA ELDEFFAEAE AREMKPRRIA .VRYASHSPE VARIEDRL.A 
atave02a TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH.
atave05a TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH.
atave04a TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTKNAFHSPH TNPILNQLH.
atave08a TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH.
atave03a TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH.
atrap02a SSWLSGDEA AVLQAAEGLG KWTRL. PTSHAFHSAR MEPMLEEFR.
atrap11a SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR.
atrap08a SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR.
atrap12a SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR.
atrap05a SSWLSGDET AVLQAAAALG KSTRL. ATSHAFHSAR MEPMLEEFR.
atrap09a SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR.
atfkb03a ASIVLSGDED AVLDVAARLG RFTRL. RTSHAFHSAR MEPMLDEFR.
atfkb07x HSWLSGDEG PVLDVAQQLG IHHRL. PTRHAGHSAR MDPLVAPLL. MeOmalonyl-CoA
atfkb08x HSWLSGDED AVLDVAQRLG IHHRL. PAPHAGHSAH MEPVAAELL. MeOmalonyl-CoA
atnid01a THCVLSGPRT ALEETAQQLH QQGIRHTWL. KVSHAFHSAL MDPMLGAFR.
atnid03a THCVLSGPRT ALEETAQHLR EQNVRHTWL. KVSHAFHSAL MDPMLGAFR.
atnid02a THCVLSGPRT ALEETAQHLR EQNVRHTWL. KVSHAFHSAL MDPMLGAFR.
atnid00a THCVLSGPRT ALEETAQHLR EQNVRHTWL. KVSHAFHSAL MDPMLGAFR.
atfkb10a SAWLTGAPD DVAAFEREWA AAGRRAKRL. DVGHAFHSRH VDGALDDFR.
atrap14a EAVWSGEPE PVADFEAAWT ASGREARKL. KVRHAPHSRH VEAVLDEFR.
atmon06a DSTVISGPSD EVDRIAGVWR BRGRKTKAL. SVSHAFHSAL MEPMLAEFT.
atmon08a DSTVISGPSG EVDRIAGVWR BRGRKTKAL. SVSHAFHSAL MEPMLAEFT.
atmon09a DSTVISGPSG EVDRIAGVWR ERGRKTKAL. SVSHAFHSAL MEPMLGEFT.
atepo02a EQWIAGVEQ AVQAIAAGFA ARGARTKRL. HVSHAFHSPL MEPMLEEFG.
atepo03x EQWIAGVEQ AVQAIAAGFA ARGARTKRL. HVSHASHSPL MEPMLEEFG. Mal/mmal
atepo08a EQWIAGAEK FVQQIAAAFA ARGARTKPL. HVSHAFHSPL MDPMLEAFR.
atepo00a DQWIAGAGQ PVHAIAAAMA ARGARTKAL. HVSHAFHSPL MAPMLEAFG.
atepo04a DAWIAGAEV QVLALGATFA ARGIRTKRL. AVSHAFHSPL MDPMLEDFQ.
atnid07a DSLVLSGDEQ AWSAAGELA ARGRRTKRL. SVSHAFHSPH MDAMLADFR.
attyl07a RSWISGAEE AVAEAAAQLA GRGRRTRRL. RVAHAFHSPL MDGMLAGFR.
atsor02a LSTWAGDED AWEIARQAE ALGRKTTRL. RVSHAFHSPH MDGMLDDFR.
atsorbla LSTWAGDED AVLKIARQVE ALGRKATRL. RVSHAFHSPH MDGMLDDFR.
atnys09a SAWLSGAEA TVTALAEQLA ADGRKTRRL. RVSHAFHSPL MEPMLDAFR.
atnys12a RSVWAGVEE DVLLLADLFA ADGRRTKRL. RVSHAFHSPL MDAMLDDFA.
atnys16a VSVWSGVEA AVGQWDQLV ERGRRVRRL. AVSHAFHSPL MDPMLDAPR.
atnys17a VSVWSGVEA AVGQWDQLV ERGRRVRRL. AVSHAFHSPL MDPMLDAFR.
atnys03a TSVWAGTEE AVAAIGARFT AQDRKTTRL. RVSHAFHSPL MDPMLAEFR.
atnys18a TSLWSGDET ATLAVAARLA EQGRRTTRL. RVSHAFHSPL MDPMLAEFR.
atnys07a NALWSGVED AAVEIGARFA AEGRRTTRL. HVSHAFHSPL MDPMLAEFR.
atnys08a TSWISGAEE ATQTVAQHFA DQGRRTTAL. RVSHAFHSPL MDPMLAEFR.
atnys05a TSVWSGAES AARTVADRLA ENGRKTTBL. RVSHAFHSPL MDPMLAEFR.
atnys06a TSWISGME ATQTVAQHFA DQGRRTTAL. RVSHAFHSPL M..MLAEFR.
atnys04a TSVWSGYEN ATLAVARHFA DQGRRTTRL. RVSHAFHSPL MAPMLDDFR.
atnys14a TAWLSGAGD AVTALGQALA ERGHRTTRL. RVSHAFHSHL MDPMLADFR.
atnys00a TAVWAGAED AVRQLTARFA DRGRRTSRL. AVSHAFHSPL MEPMLDAFR.
atnys10a QSWISGDEE AAETIAATFA ERGRKTKRL. RVSHAFHSPR MDGMLDAFR.
atnys15a RSVWAGAED AALAVRRHFD DLGRRTTRL. PVSHAFHSPL MDPMLDAFR.
atnys13a RSVWAGAED AVRAVADRLA ADGRRTRRL. TVSHAFHSPL MDPMLTDFA.
atave10a RSIVLSGDED AVLDLAQQWA ARGRRTRRL. RTSHAFHSPH MDAMLGDFR.
atrif02a DSWLSGTEA AVLAVADELA GRGRKTRRL. AVSHAFHSPL MEPMLDDFR.
atmon03a TAVWSGEAA AVGEVBKALR GRGLKTKRL. NVSHAFHSPL IEPMLDDFR.
atave12a ASWFSGAED EVGNMADWFA ERGRRVKRL. RTGHAFHSPL MDPMLEEFQ.
atrif09a SAWLSGDAD AWAAAARMR ERGHKTKQL. KVSHAFHSAR MAPMLAEPA.
atmon00a DSVWSGDRA TVDELTAAWR GRGRKAHHL. KVSHAFHSPH MDPILDELR.
attyl03a SSVWSGDEA AVLELLEQWR AEGREARRL. AVSHAFHSPR MDGMLTQFD.



HAFH/YASH/TAGH motif
351 400
ataveOOx SGLLPITPRP SRIPFHSSVT G GRL. .DTRELDAAY WYRNMSSTVR
atdebs00p SALAWFAPGG SEVPFYASLT G GAV. .DTRELVADY WRRSFRLPVR
atepo06p AALGAIRPRA AAVPMRSTVT G GVI. .AGPELGASY WADNLRQPVR
atepo07p AALGAIRPRA AAVPMRSTVT G GVI. .AGPELGASY WADNLRQPVR
atepo01p AALGGLRPGA AAVPMRSTVT G.....AMV. .AGPELGANY WMNNLRQPVR
atepo05p AALGELEPRQ ATVSMRSTVT S TIM. .AGPELVASY WADNVRQPVR
atsoralx DALREVRPNK AQIPIVSEVT G TAL. .DGERFDASH WVRNFGDPAL
atfkb01p DALEGITSST PSVPWWSTVD S GWV. ..TEPFGDAY WYRNLRQPVA
atfkb05p DVLGDITSSA PSVPWWSTVD G GWV. ..TEPAGDDY WYRNLRQPVA
atrap03p DITSDSSSQA PLVPWLSTVD G SWV. ..DSPLDGEY WYRNLREPVG
atrap06p DITSDSSSQA PWPWLSTVD G SWV. ..DSPLDVEY WYRNLREPVG
atrap04p GITAGIGSQP PWPWLSTVD G SWV. ..DSPLDGEY WYRNLREPVG
atrap13p DITSDSSSQT PLVPWLSTVD G.....TWV. ..DSPLDGEY WYRNLREPVG
atrap01p GVIAGVDSRA PWPWLSTVD G TWV. ..EGPLDAEY WYRNLREPVG
atrap07p DITAGIGSQA PWPWLSTVD G TWV. ..EGPLDVEY WYRNLREPVG
atrap10p DITSDSSSQD PLVPWLSTVD G TWV. ..DSPLDGEY WYRNLREPVG
atfkb04x RIVAATTSRA PEIPWFSTAD E RWI. ..DAPLDDEY WFRNMRNPVG
attyl04p RVLSGIRPRS PRVPVCSTVA G E..Q PGEPVFDAGY WFRNLRNRVE
attyl06p RVLSGIRPRS PRVPVCSTVA G E..Q PGEPVFDAGY WFRNLRNRVE
attyl01p RVLSGIRPRS PRVPVCSTVA G E..Q PGEPVFDAGY WFRNLRNRVE
attyl02p RVLSGIRPRS PRVPVCSTVA G E..Q PGEPVFDAGY WFRNLRNRVE
attyl00p RVLSGIRPRS PRVPVCSTVA G E..Q PGEPVFDAGY WFRNLRNRVE
atnid05b EVLAPVAPRP ADIPFYSTVT G.....GLL. .DGTELDATY WYRNMREPVE
attyl05b EVLAPVSPRS SDIPFYSTVT G APL. .DTERLDAGY WYRNMREPVE
atnid06x DLLSGVRPAP SRIPFFSTVT A GPC. .GGDQLDGAY WYRNTREPVE
atdebs01p ELGEDFHPLP GFVPFFSTVT G RWT. .QPDELDAGY WYRNLIUITVR
atmon02p ERLADIRPTN TDVAFYSTVT A ERL. TDTTMDTDY WVTNLRQPVR
atmon10p ERLADIRPAN TDVAFYSTVT A ERL. TDTTALDTDY WVTNLRQPVR
atmon04p EGLADIRPAN TDVAFYSTVT A ERL. TDTTALDTDY WVTNLRQPVR
atmon07p DRLADIRPAT TDVAFYSTVT A ERL. TDTTALDTDY WVTNLRQPVR
atmon11p DRLADIQPTT TDVAFYSTVT A ERL. DDTTALDTAY WVTNLRQPVR
atmon12p ERLADIRPTT TDVAFYSTVT A ERL. DDTTTLDTDY WVTNLRQPVR
atmon05b EGLAGIRPAA TDVAFYSTVT A GHL. TDTTELDTAY WVRNVRRTVR
atmon01p HTLSGVRPTT APVAFYSAVT G TRI. .DTAGLDTDY WVTNLRRPVR
atdebs02p QALAGITPRR AEVPFFSTLT G D..F LDGTELDAGY WYRNLRHPVE
atdebs06p ADLDGISARR AAIPLYSTLH G E..R RD...MGPRY WYDNLRSQVR
atave01p ELLGDISPQP SGVPFFSTVE G TW LDTTTLDAAY WYRNLHQPVR
atave07p ELLGDISPQP SGVPFFSTVE G TW LDTTTLDAAY WYRNLHQPVR
atave06p HLLGDITPQP STVPFFSTVE G TW LDTTTLDAAY WYRNLHQPVR
atave09p HLLGDITPQP STMPFFSTW G HLVW Y.TTTLDAAY WYRNLHQPVR
atnys01p EVLAELAPRT SEVPFFSTVT G DWL. .DTARMDAGY WFRNLRGRVR
atnys11p EVLAELAPRT SEVPFFSTVT G DWL. .DTARMDAGY WFRNLRGRVR
atrif05p BTLAGIDAQA PWPFYSTVA G EWI. TDAGWDGGY WYRNLRNQVG
atrif07p BTLAGITAQA PDVPFRSTVT G GWV. RDADVLDGGY WYRNLRNQVR
atrif08p EALAGIEAHA PTLPFFSTLT G DWI. REAGWDGGY WYRNLRNQVG
atrif10p BALAGIDARA PLVPFLSTLT G EWI. RDEGWDGGY WYRNLRGRVR
atrif03p KTLAGIDARV PAIPFYSTVL G TWI. EQA.WDAGY WYRNLRQQVR
atrif06p ETLAGIEAQA PKVPFYSTLI G DWI. RDAGIVDGGY WYRNLRNQVG
atrif04p EMLGGIRAQA PEVPFYSTVT G GWV. EDAGVLDGGY WYRNLRRQVR
atrif01p ETLAGISAQA PAVPFYSTVT S EWV. RDAGVLDGGY WYRNLRNQVR
atnys02p RVLAPVDPRA PEVPFYSTVT G DRV. DDAA.FDGAY WYTNLRQTVR
atfkb02p DALADLTPGA PEIPFFSTVD E AWL. DRPA..DAAY WYDNVRCPVR
atave11p ELLAPIRART GDVPFYSTVT G ERI. .DGTELDADY WYRNLRQWR
atdebs03p ETTGDIAPRP ARVTFHSTVE S RSM. .DGTELDARY WYRNLRETVR
atnid04p AVLAGLRPRA GRVPFYSTVE A EPL. .DGTALDAGY WYRNLRQRVR
atdebs05p TELAGISPVS ADVALYSTTT G QPI. .DTATMDTAY WYANLREQVR
atdebs04p AELGTITAVR GSVPLHSTVT G EVI. .DTSAMDASY WYRNLRRPVL
atave02a QHTQTLTYHP PHTPLITANT PPDQLLTPHY WTQQARNTVD
atave05a QHTQTLTYHP PHTPLITANT PPDQLLTPHY WTQQARNTVD
atave04a QHTQTLTYHP PHTPLITANT PPDQLLTPHY WTQQARNTVD
atave08a QHTQTLTYHP PHTPLITANT PPDQLLTPHY WTQQARNTVD
atave03a QHTQTLTYHP PHTPLITANT PPDQLLTPHY WTQQARNTVD
atrap02a AVAEGLTYRT PQVA MA AGDQVMTAEY WVRQVRDTVR
atrap11a AVAEGLTYRT PQVS MA VGDQVTTAEY WVRQVRDTVR
atrap08a AVAEGLTYRT PQVS MA AGDQLTTTEY WVRQVRDTVR
atrap12a AVAEGLTYRT PQVS MA VGDQVTTAEY WVRQVRDTVR
atrap05a TVAERLTYQT PRLA MA AGDRVTTAEY WVRQVRDTVR
atrap09a AVAQGLTYHA PGW MA AGDRVMTAEY WVRQVRDTVR
atfkb03a DVAERLTYHE PKLP MA AGADCATPEY WVRQVRDTVR
atfkb07x EAASGLTYHQ PHT A IPEDPTTAAY WARQVRDQVR
atfkb08x ATTRELRYDR PHT... A IPNDPTTAEY WAEQVRNPVL
atnid01a DTLNTLNYQP PTIPLISNLT GQIADPNHL CTPDY WIDHARHTVR
atnid03a DTLNTLNYQP PTIPLISNLT GQIADPNHL CTPDY WIDHARHTVR
atnid02a DTLNTLNYQP PTIPLISNLT GQIADPNHL CTPDY WIDHARHTVR
atnid00a DTLNTLNYQP PTIPLISNLT GQIADPNHL CTPDY WIDHARHTVR
atfkb10a GVLESLAFGA ARLPWSTTT GRDAAGD.LA TPEH WLRHARRPVL
atrap14a TALESLKFRA PALPWSTVT GRLIDQDEMG TPEY WLRQVRRPVR
atmon06a EAIRGVKFRQ PSIPLMSNVS GERA. GEEITDPEY WARHVRNAVL
atmon08a EAIREVKFTR PKVSLISNVS GLEA GEEIASPEY WARHVRQTVL
atmon09a EAIRGVKFRQ PSIPLMSNVS GERA GEEITSPEY WARHVRQTVL
atepo02a RVAASVTYRR PSVSLVSNLS GKWT.DEL SAPGY WVRHVREAVR
atepo03x RVAASVTYRR PSVSLVSNLS GKWA.DEL SAPGY WVRHVREAVR
atepo08a RVTESVTYRR PSIALVSNLS GKPCT.DEV SAPGY WVRHAREAVR
atepo00a RVAESVSYRR PSIVLVSNLS GBCACT.DEV SSPGY WVRHAREWR
atepo04a RVAATIAYRA PDRPWSNVT GHVAG.PEI ATPEY WVRHVRSAVR
atnid07a AVAESVTYRT PRLPIVSEVT GRPAAPSEL MDPGY WTRQIREPVR
attyl07a EVAAGLRYRE PELTWSTVT GRPARPGEL TGPDY WVAQVREPVR
atsor02a RVAQSLTYHP ARIPIISNVT GARATDHEL. .....ASPDY WVRHVRHTVR
atsorbla RVAQGLTFHP ARIPIISNVT GARATDQEL. .....ASPET WVRHVRDTVR
atnys09a AWEDLTLQP PLLPWSNLT GKPATVAQL TSADY WVDHVRHAVR
atnys12a AVARGLTYHP PTIPFVSNVS GGLATAEQV RTPDY WVGHVRAAVR
atnys16a AVAEGLEYHQ PRIPWSNVT GEVAAAEEL CAADY WVRHVRATVR
atnys17a AVAEGLEYHQ PRIPWSNVT GEVAAAEEL. .....CAADY WVRHVRATVR
atnys03a AVAAGLTYHE PRIPVLSNLT GTVAAVADL CSADY WVRHVREAVR
atnys18a AVAEGLSYGE PQIPWSNLT GAVADGTLL GTADY WVRHVREAVR
atnys07a WAEGLSYAA PSLPWSNLT GQVATADEL CSAEY WVRHVREAVR
atnys08a AVAEGLSYAT PSLPWSNLT GWLATADEL CSAEY WVRHVREAVR
atnys05a AVAEGLSYAT PTLPWSNLT GRLATADDL CSAEY WARHVREAVR
atnys06a AVAEGLSYAT PTLPWSNLT GQVATADEL CSAEY WVRHVREAVR
atnys04a AWESLTFTA PTTPWSNLT GELAPAEAL CSADY WVRHVREAVR
atnys14a TVAEGLEYHP PRIPWSNLT GDVADAADL CSADY WVRHVRGTVR
atnys00a DWSRLTFHQ PSIPLVSNLT GELA.GSEI TSAEY WVRHVRDTVR
atnys10a IVAEGLTYRA PRIPLVSDLT GRRADDAEV CTAEY WVRHVREAVR
atnys15a TALAPLTFAB PEIPWSNLT GLPATAEEL ATPHY WVCHVRQAVR
atnys13a RVAEGLTYHE PRIPLVSTLL GAPAGA.EL RTPDY WVRHVRETVR
atave10a RAAEQVTPSA PRIPWSNVT GAPLPAETM CTPDY WVEHARSTVR
atrif02a AVAERLTYRA GSLPWSTLT GELAA...L DSPDY WVGQVRNAVR
atmon03a EVARGLTFHA PTLPWSNLT GRLADAELM ADAEY WVRHVRRPVR
atave12a QVAASLTYSE PAIPMVSTLT GDIVAAGEL SDPEY WVRQVRRTVR
atrif09a AELAGVTWRE PEIPWSNVT GRFAEPGEL TEPGY WAEHVRRPVR
atmon00a AVAAGLTFHE PVIPWSNVT GELVTATATG SGAGQADPBY WARHAREPVR
attyl03a RVARTLTFAP PTIPLVSTLT GTPVTEETL CTADH WVRQARBPVR
401 450
atave00x FEPAARLLLQ QGP.KTFVEM SPHPVLTMGL QELAPDLG DTTG
atdebs00p FDEAIRSALE VGP.GTFVBA SPHPVLAAAL QQTL DAEG
atepo06p FAAAAQALLE GGP.ALFIEM SPHPILVPPL DEIQTA AE
atepo07p FAAAAQALLE GGP.ALFIEM SPHPILVPPL DEIQTA AE
atepo01p FABWQAQLQ GGH.GLFVEM SPHPILTTSV EEMRRA AQ
atepo05p FAEAVQSLME DGH.GLFVEM SPHPILTTSV EEIRRA TK
atsoralx FSTAIDHLLQ EGF.DIFLEL TPHPLALPAI ESNLRR SG
atfkb01p MDTAVSELDG ....SLFIEC SAHPVLLPAL DQ
atfkb05p MDTAIGELDG ....SLPIEC SAHPVLLPAL DQ
atrap03p FHPAVGQLQA QGD.TVFVEV SASPVLLQAM DD
atrap06p FHPAVGQLQA EGD.TVFVEV SASPVLLQAM DD
atrap04p FHPAVSQLQA QGD.AVFVEV SASPVLLQAM DD
atrap13p FHPAVSQLQA QGD.TVFVEV SASPVLLQAM DD
atrap01p FEPAAGQLQA QGD.TVFVEV SASPVLLQAM DD
atrap07p FDSAVGQLRA EGD.TVFVEV SASPVLLQAM DD
atrap10p FHPAVSQLQA QGD.TVFVEV SASPVLMQAM DD
atfkb04x FAAAVAAARE PGD.TVFIEV SAHPVLLPAI NG
attyl04p FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT....A ..EAAD
attyl06p FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT A EAAD
attyl01p FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT A EAAD
attyl02p FSAWGGLLE QGH.RRFIEV SAHPVLVHAI EQT A EAAD
attyl00p FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT....A EAAD
atnid05b FERATRALIA DGH.DVFLET SPHPMLAVAL EQT V TDAG
attyl05b FEKAVRALIA DGY.DLFLEC NPHPMLAMSL DET L TDSG
atnid06x FDATVRALLR AGH.HTEIEV GPHPLLNAAI DEI A ADEG
atdebs01p FADAVRALAE QGY.RTFLEV SAHPILTAAI EEI G DGSG
atmon02p FADTIEALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD
atmon10p FADTIEALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD
atmon04p FADTIEALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD
atmon07p FADTIDALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD
atmon11p FADTIEALIA DGY.RLFIEA SPHPVLNLGI QETIEQQA GAA
atmon12p FADTIEALLA DGY.RLFIEA SPHPVLNLGM EETIER AD
atmon05b FADTIDALLA DGY.RLFIEV SPHPVLNLAL EGLIER AA
atmon01p FADAVTALLA DGH.RVFIEA SSHPVLTLGL QETFEE AG
atdebs02p FHSAVQALTD QGY.ATFIEV SPHPVLASSV QETL DDAE
atdebs06p FDEAVSAQSP DGH.ATFVEM SPHPVLTAAV QE IA
atave01p FSDAVQALAD DGH.RVFVEV SPHPTLVPAI EDTTEDTA ED..
atave07p FSDAVQALAD DGH.RVFVEV SPHPTLVPAI EDTTEDTA ED..
atave06p FSHAIQTLTD DGH.RAFIEI SPHPTLVPAI EDTTENTT ..EN..
atave09p FSHAIQTLTD DGH.RPFIEI SPHPTLVPAI EDTTENTT EN..
atnys01p FADAVADLLA AEY.RAFVEV SSHPVLTMAV LD LI EEAG
atnys11p FADAVADLLA AEY.RAFVEV SSHPVLSMAV QE AI DEAG
atrif05p FGPAVAELIE QGH.GVFVEV SAHPVLVQPI SE LT D...
atrif07p FGPAVAELLE QGH.GVFVEV SAHPVLVQPI SE....LT D...
atrif08p FGPAVAELLG LGH.RVFVEV SAHPVLVQAI SA IA DD..
atrif10p FGPAVEALLA QGH.GVFVEL SAHPVLVQPI TE LT DE..
atrif03p FGPSVADLAG LGH.TVFVEI SAHPVLVQPL SE IS DD..
atrif06p FGPAVAELVR QGH.GVFVEV SAHPVLVQPL SE LS DD..
atrif04p FGPAVAELIE QGH.RVFVEV SAHPVLVQPI NE LV DD..
atrif01p FGAAATALLE QGH.TVFVEV SAHPVTVQPL SE LT GD..
atnys02p MEEATRALLA AGH.RVFIEV SPHPVLAAPI QETQEAVA EATG
atfkb02p FGAAAARLAE LGH.RVFVEA SPHPVLTTAL ADTLAG H
atave11p FRDATQALVR AGH.TVFIEA CPHPAVAVGV QETLDE.M GD
atdebs03p FADAVTRLAE SGY.DAFIEV SPHPVWQAV EEAVEE.A DGAE
atnid04p FESALRAMLA DGV.DAFVEC SPHPVLTVPV RQTLED.A GA.
atdebs05p FQDATRQLAE AGF.DAFVEV SPHPVLTVGI EATLDS.A LPAD
atdebs04p FEQAVRGLVE QGF.DTFVEV SPHPVLLMAV EET A EHAG
atave02a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNPPTT TLTLTHPHHH
atave05a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNTPTT TLTLTHPHHH
atave04a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNTPTT TLTLTHPHHH
atave08a IATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNTPTT TLTLTHPHHH
atave03a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HDNLPNTPTT TLTLTHPHHH
atrap02a FGEQVASFED A VFVEL GADRSLARLV DG
atrap11a FGEQVASYED A....VFVEL GADRSLARLV DG
atrap08a FGEQVASYED A VFVEL GADRSLARLV DG
atrap12a FGEQVASYED A VFVEL GADRSLARLV DG
atrap05a FGEQVASYED A VFIEL GADRSLARLV DG
atrap09a FGEQVASYED A VFVEL GADRSLARLV DG
atfkb03a FAEQVAAYDG A ALLEI GPDRNLARLV DG
atfkb07x FQAHAERYPG A TFLEI GPNQDLSPW DG
atfkb08x FHAHTQRYPD A VFVEI GPGQDLSPLV DG
atnid01a FADAVQTAHD QR.TTTYLEI GAHPTLTTLL HHTLDNP
atnid03a FADAVQTAHH QG.TTTYLEI GPHPTLTTLL HHTLDNP
atnid02a FADAVQTAHD QR.TTTYLEI GPHPTLTTLL HHTLDNP
atnid00a FADAVQTAHH QG.TTTYLEI GPHPTLTTLL HHTLDNP
atfkb10a YADAVRELAD LG.VNMFVAV GPSGALASAA SENTGGSAGT YH
atrap14a FQDAVRELAE QG.VGTFVEV GPSGALASAG VECLGGDA.S FH
atmon06a FQPAIAQVAD S..AGVFVEL GPAPVLTTAA QHTLDE.SD. .SQES
atmon08a FQPGIAQVAS T..AGVFVEL GPGPVLTTAA QHTLDDVTDR HGPEP
atmon09a FQPGVAQVAA E..ARAFVEL GPGPVLTAAA QHTLDHITEP EGPEP
atepo02a FADGVKALHE AG.AGTFVEV GPKPTLLGLL PACLPEAEP
atepo03x FADGVKALHE AG.AGTFVEV GPKPTLLGLL PACLPEAEP
atepo08a FADGVKALHA AG.AGLFVEV GPKPTLLGLV PACLPDARP
atepo00a FADGVKALHA AG.AGTFVEV GPKSTLLGLV PACMPDARP
atepo04a FGDGAKALHA AG.AATFVEV GPKPVLLGLL PACLGEADA
atnid07a FAAAVRAARA AG.AATFVEL GPDAVLSGMA RECAAG DTGT
attyl07a FADAVRTAHR LG.ARTFLET GPDGVLCGMA EECLED DTVA
atsor02a FLDGVRALHA EG.ARVFLEL GPHAVLSALA QDALGQ ...D.EGTS
atsorbla FLDGVRTLHA EG.ARAFLEL GPHPVLSALA QDALGH D.EGPS
atnys09a FADGIDWLA. RHDTTAFLEL GPDGVLSAMA QDCLDA A.DAD.
atnys12a FADGIDWLAT QGDVHTFLEL GPDGVLSAMA RESLTD P.SRT.
atnys16a FADGVRTLAE RG.ATAFLEI GPDGVLSALA RGVL P.AEA.
atnys17a FADGVRTLAE RG.ATAFLEI GPDGVLSALA AACL.F D.TDA.
atnys03a FADGVTALTD RG.VTTLVEL GPDGVLSAMA QESL .P.DGA.
atnys18a FADGIRALTD AG.VGAFLEL GPDGTLAALA QQSA ....P.D.A.
atnys07a FADGVTALEA EG.VRTFLEL GPDGVLAAMA GASL T.ESS.
atnys08a FADGITTLEA EG.VRTFLEL GPDGILSALA QQSL A.GEA.
atnys05a FADGVSTLEN EG.VTTFLEL GPDGVLSliMA QQSL T.GDA.
atnys06a FADGVTALEA EG.VRTFLEL GPDGVLAAMA RETV A.DDT.
atnys04a FADGIRTLAD RG.VTTFVEL GPDSVLSAMA QESA P.EGA.
atnys14a FADGVRTMAD RG.VHLFLEL GPDAVLSAMA RQCA P.D.A.
atnys00a PADGITALAK AG.ADVLIEL GPGGVLSAMA RDTL.G P.DST.
atnys10a FADCVRTLRD AG.ATTFLEL GSDGLLTAMA EDTL.G D.DHD.
atnys15a FGDGVRALAD RG.VRTFLEL GPDGVLSALV RENL P.EPG.
atnys13a FADGVRALHD AG.AGTFVEI GPDGVLTALT QQTLDT V.EAGA
atave10a FADGISWLQB QG.VTTCLEI GPDGTLSALA QDSLSA P
atrif02a FSDAVTMGA QG.ASTFLEL GPGGALAAMA LGTLGG P.EQSC
atmon03a FHDGLRALSE QGWR.YLEL GPDPVLATMV QDGLPA P.AEGE
atave12a FGDAISRLHT DG.VRTFMEL GPDGTLSALA EECLEATADS HPADD.DTGT
atrif09a FAEGVAAATE SGG.SLFVEL GPGAALTALV EET
atmon00a FLSGVRGLCE RG.VTTFVEL GPDAPLSAMA RDCFPAPADR SRPRP
attyl03a FLDAMRTLRA DG.IDTFVEL GPDGVLSAMA RDCADDRPDG DTTGAGDGET
atave00x TADTVIMGTL RRGQGTLDHF LTSLAQLRGH GE..TSATTV LSARLTALSP
atdebs00p SSAAW.PTL QRGQGGMRRF LLAAAQAFTG GV..AVDWTA AYDDVGA.EP
atepo06p QGGAAV.GSL RRGQDERATL LEALGTLWAS G..YPVSWAR LFPAGG
atepo07p QGGAAV.GSL RRGQDERATL LEALGTLWAS G..YPVSWAR LFPAGG
atepo01p RAGAAV.GSL RRGQDERPAM LEALGTLWAQ G..YPVPWGR LFPAGG
atepo05p REGVAV.GSL RRGQDERLSM LEALGALWVH G..QAVGWER LFSAGGAGL.
atsoralx RRGWL.PSL RRNEDERGVM LDTLGVLWR G..APVRWDN VYPA...AF
atfkb01p .E.RTV.ASL RTDDGGWDRF LAALAQAWTQ GA..DVDWTT LIEPA
atfkb05p .B.RTV.ASL RTDDGGWDRF LTALAQAWTQ GA..DVDWTT LIAPA
atrap03p .DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG T
atrap06p .DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG T
atrap04p .DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG T
atrap13p .DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG T
atrap01p .DWTV.ATL RRDDGDATRM LTALAQAYVE GV..TVDWPA VLG T
atrap07p .DWTV.ATL RRDDGDATRM LTALAQAFVE GV..TVDWPA ILG T
atrap10p .DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWRA VLGDV
atfkb04x ...TTV.GTL RR.GGGADRV LDSLAKAHTV GV..AVDWST WAATGAADD
attyl04p RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW
attyl06p RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW
attyl01p RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW..
attyl02p RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW..
attyl00p RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW....
atnid05b TDAAVL.GTL RRRHGGPRAL ALAVCRAFAH GVE..VDPEA VF
attyl05b GHGTVM.HTL RRQKGSAKDF 04ALCLAYVN GLE..IDGEA LF
atnid06x VAATAL.HTL QRGAGGLDRV RNAVGAAFAH GVR..VDWNA LF..
atdebs01p ADLSAI.HSL RRGDGSLADF GEALSRAFAA GVA..VDWES VH....
atmon02p MPATW.PTL RRDHGDTTQL TRAAAHAFTA G..ADVDWRR WF..
atmon10p IPATW.PTL RRDHGDTTQL TRAAAHAFTA G..APVDWRR WF....
atmon04p IPATW.PTL RRDHGDTTQL TRAAAHAFTA G..ADVDWRR WF.
atmon07p IPATW.PTL RRDHGDTTQL TRAAAHAFTA G..ATVDWRR WF...
atmon11p GTAVTI.PTL RRDHGDTTQL TRAAAHAFTA G..APVDWRR WF.
atmon12p MPATW.PTL RRDHGDAAQL TRAAAQAFGA G..AEVDWTG WF
atmon05b VPATW.PTL RRDHGDTTQL ARAAAHAFAA G..ADVDWRR WF....
atmon01p VDAVTV.PTL RREDGGRARL ARSLAQAFGA G..CAVRWEN WF...
atdebs02p SDAAVL.GTL ERDAGDADRF LTALADAHTR GVA..VDWEA VL
atdebs06p ADAVAI.GSL HRDTAE.EHL IAELARAHVH GVA..VDWRN VF.
atave01p ..VTAI.GSL RRGDNDTRRF LTALAHTHTT GIGTPTTWHH HY..
atave07p ..VTAI.GSL RRGDNDTRRF LTALAHTHTT GIGTPTTWHH HY
atave06p ..ITAT.GSL RRGDNDTHRF LTALAHTHTT GIGTPTTWHH HY
atave09p ..ITAT.GSL RRGDNDTHRF LTALAHTHTT GIRTPTTWHH HY
atnys01p VTAVAT.GTL RRDQGGAGRF LLSAAEVFVR GV..DVDWAG AF.....
atnys11p VPAVAA.GTL RRDQGGTDRF LLSAAEVFVR GV..DVDWAG AF........
atrif05p ..AWT.GTL RRDDGGVRRL LTSMAELFVR GV..PVDWAT MA
atrif07p ..AWT.GTL RRDDGGLRRL LTSMAELFVR GV..RVDWAT VA........
atrif08p TDAWT.GSL RREEGGLRRL LTSMAELFVR GV..DVDWAT MV.
atrif10p TAAWT.GSL RRDDGGLRRL LTSMAELFVR GV..EVDWTS LV
atrif03p ..AWT.GSL RRDDGGLRRL LASAAELYVR GV..AVDWTA AV.......
atrif06p ..AWT.GSL RREDGGLRRL LTSMAELYVQ GV..PLDWTA VL
atrif04p TEAWT.GTL RREDGGLRRL LASAAELFVR GV..TVDWSG VL........
atrif01p ....AI.GTL RREDGGLRRL LASMGELFVR GI..DVDWTA MV.
atnys02p GSAWL.GSL RRDEGGPRRF LTSLAEAHTH GA..PVDWTT TF.
atfkb02p PNTAVT.GTL RRGDGGARRF TRSLAELWVR GV..PVSW.. .........
atave11p LDSLW.GSL RRGEGGLRRF LMSVAELFVG GV..AVEWSG VF........
atdebs03p .DAVW.GSL HRDGGDLSAF LRSMATAHVS GV..DIRWDV AL........
atnid04p .GAVAV.GSL RRDDGGLRRF LTSAAEAQVA GV..PVDWAA LC.
atdebs05p AGACW.GTL RRDRGGLADF HTALGEAYAQ GV..EVDWSP AF.
atdebs04p AEVTCV.PTL RREQSGPHEF LRNLLRAHVH GVGADL
atave02a
atave05a
atave04a
atave08a
atave03a
atrap02a
atrap11a
atrap08a
atrap12a
atrap05a
atrap09a
atfkb03a
atfkb07x
atfkb08x
atnid01a
atnid03a
atnid02a
atnid00a
atfkb10a
atrap14a
atmon06a
atmon08a
atmon09a
atepo02a
atepo03x
atepo08a
atepo00a
atepo04a
atnid07a
attyl07a
atsor02a
atsorbla
atnys09a
atnys12a
atnys16a
atnys17a
atnys03a
atnys18a
atnys07a
atnys08a
atnys05a
atnys06a
atnys04a
atnys14a
atnys00a
atnys10a
atnys15a
atnys13a
atave10a
atrif02a
atmon03a
atave12a
atrif09a
atmon00a
attyl03a



PQTH 

PQTH 

PQTH 

PQTH 

PQTH 

lAML 

VAML 

VAML 

VAML 

VAML 

VAML 

IPVL 

IPTQ 

lALQ 

TTIPTL 

TTIPTL 

TTIPTL 

TTIPTL 

AVL 

AVL 

VLVASL 

VLVSSL 

WTASL 

TLLASL 

TLLASL 

VLLPAS 

ALLASS 

VLVPSL 

AFAAALRRGR 
LLPAIHKPGT 
PCAFL. .PTL 
PCAFL. .PTL 
.AVTL. ,PAL 
.AL.L. .PTL 
.L.VT. .PTL 
.E.W. .PAL 
.A.AV. .PLL 
.V.SV..PVL 
.L.AV..PLL 
.V.TV..PVL 
= .A,TV..PAL 
.V.TV. .PVL 
.G.TI. .PLL 
.V.W. 
.TDW. 
.AELV. 
.LVAV* 
PAVW. 
.ARAL 

EPEPWAAAL 
PQENLLIPLL 
'.AEVTCVAAL 

AAIATC 

PDPLLTLPLL 



HGD.HE. . < 
HGD.HE. . . 
HGD.HE. . . 
HGD.HE. . , 
HTD.HE. . , 

HGD. HE. . , 

HGE. DE. . 
TGTPEE . . 
NGTADE... 



.LLTNL 

LLTNL 

LLTNL 

LLTNL 

LLTNL 

. .AQAAVGAL 
. .AQAAVGAL 
. . AQAAVSAL 
. . IQAAIGAL 
. .AQAAISAL 
. .TQAAIGAL 
. .ARSAMTAL 
, .VQALHTAL 
. .VHALHTAL 



AK. 
AK, 
AK. 
AK. 
AK. 



.TT 
.TT 
.TT 
.TT 
.TT 



HREHPEPETL TTAL AT 

HREHPEPETL TTAL. . . .AT 

HREHPBPETL TTAL AT 

HRERPBPETL TQAI AA 



.PAL 
.PAL 
.PML 
.PVL 
.PLQ 
.PAL 
.ATL 



RARTGEES , 
RPRSPEDV. . 
AGERPEES . . 
AGERPEES.. 
HPDRPDDV. . 
RAGREEA. . . 
RAGREEA. . . 
RAGRDEA. . , 
RAGRDEP. - , 
RADRSEC. 

.... PEC • • t 
APHGPAA. . . 
RKGRDDA. . , 
RKGRDDA. . < 

RAGRPEE- - 

RGDRPEE. . , 

RKDRDEE. . 

RKGRPEE. . 

RKDRPEE. . 

RKDRDEE. . 

RKDRPEE. . 

RKDRGEE. . 

RKDRDEE. . 

RRNMPEE. . 

RRDRPEE. . 

RRNRDED. . 

SKGRPEE. . 

RAGRAEE. . 

RKERPEE . . 

RRDRAGD. . 

RPDQPEA. . 

RKNGABV. . 

RSKHDEG. . 

RPDSPEP. . 

RDDRPEV. . 

RRGRDEV. . 



. .AALTAV 
. .CLMTAI 
. . .AFVEAM 
, . .AFVEAM 
. . .AFAHAM 
. .A6VLEAL 
. .AGVLEAL 
, .ASALEAL 
. .ATVLEAL 
. .EWLAAL 
. .ATVLPAA 
. . PGALRAA 
. .EAFTAAL 
. .EAFTAAL 
. .HTLTTAL 
. . PALVTAV 
. .SALLAGL 
. . HTALTAA 
. .LSAVTGL 
. .PAAVAAL 
. .PAALAAL 
. .STALTAR 
. .TSALTAL 
. .RTLLTAL 
. .QAVLAAL 
. .ETLVGAV 
. .TAFAGAL 
. .LAAATAL 
. .TTVLAAL 
. .LALLEGL 
. .RSVMTAL 
. . PDVLTAL 
. .RTLLGAV 
. . GTLLTGL 
. .TALITAV 
. .ATFLRSL 



T. 
T. 
T. 
T. 
T. 
V. 
V. 
V. 
V. 
V. 
V. 
V. 
V. 
L. 
T, 
T- 
T. 
A. 
V. 
I. 
V. 
V. 
V. 
V, 
V, 
V. 
V. 
L. 



RRSVPETGDA EHPGGFERAL 



ahlyvng.vs 
ahlyvng.vs 
ahlyvng.vt 
ahlyvng.vt 
ahlyvng.vt 
ahlyvng.vt 
arlhtgg.va 
arlhtrg.gv 
arlftrg.at 
. .lhttghtt 
..lhttghtt 
. .lhttghtt 
vgvrtdgidw 
aelhahg.ap 
aelhagg.ta 
arlhtag.va 
arlhtag.va 
7udlhvag.is 
grlwaaggs. 
6rlwaaggs. 
ggfwwggs. 
gglwavggl. 
gawyawgga. 
atafvqg.ah v 
aaaygrg.ar v 
galhaag.lt P 
galhaag.lt p 
aglhvhg.at l 
aaahahg.ar v 
arlhvag.vt v 
aqlhvag.vd 
arahvrg.vt 
arlhtag.vp 
aqlhiag.ar 
ahlhtrg.li 
ahlhtag.lr v 
grlhttg.tp i 
chlqvlg.ve 
arlhvhg.ag 
grlhtlg.vp 
arlqvrg.vd 
gtlwahg.ad 
atlhthg.tg 
aelfva6.ta 
aelhvrg.vg 
aalhtdg.qp 
arlhthgaaa 
aelfvrg.va v 
aqayvrg.ad v 
atayahgv-. . 



TWHPHHY 
TWHPHHY 
> TWHPHHY 
.TWHPHHY 
.TWHPHHY 
.EW.SAVL 
^EW.SAVL 
,DW.PALL 
.DW.PALL 
.DW.TALL 
.DW.TALL 
.DW.PEVI 
.DW.PTVL 
.DW.SRIL 



.DL.AAVL 
IDW.AKVL 
.DW.SVLF 
.DW.SVLF 
.DW.SAYF 
.SW.PGVF 
.SW.PGVF 
.TW.SGVF 
.SW.AGLF 
.DW.KGVF 
.DW.AAPY 
.DW.AGMH 
.DW.SAFF 
.DW.NAFF 
.DW.TGCF 
.DW.SGYF 
.DW.SAAL 
.DW.TAVL 
.RW.AGLF 
.DW.TAFY 
.DW.PVLF 
.DW.QDFF 
.DW.AAFF 
, .DW.AALL 
, .DW.SATP 
, .RW.DAYF 
. .DW.PAFY 
. .DW-AAYL 
. .DW^DAVF 
. .SW.PAYF 
. .EW.AGVF 
. .DW.TTVL 
. .DL.TALF 
. .NW.PAAL 
. .DW.PALL 
. .DF.TRAY 
PLRL 



Flg2t 
atave00x
atdebs00p
atepo06p
atepo07p
atepo01p
atepo05p
atsoralx
atfkb01p
atfkb05p
atrap03p
atrap06p
atrap04p
atrap13p
atrap01p
atrap07p
atrap10p
atfkb04x
attyl04p
attyl06p
attyl01p
attyl02p
attyl00p
atnid05b
attyl05b
atnid06x
atdebs01p
atmon02p
atmon10p
atmon04p
atmon07p
atmon11p
atmon12p
atmon05b
atmon01p
atdebs02p
atdebs06p
atave01p
atave07p
atave06p
atave09p
atnys01p
atnys11p
atrif05p
atrif07p
atrif08p
atrif10p
atrif03p
atrif06p
atrif04p
atrif01p
atnys02p
atfkb02p
atave11p
atdebs03p
atnid04p
atdebs05p
atdebs04p
TQQQSLLLDL
GSLPE.FAPA 
P.H

P.D 

TTT 

ATT 

TTA 

TTT 

.TAA 

ATT 

PAT 

AASVTAHDTG 
. . . DPALPPG 
. . . DPALPPG 
.DPALPPG 
...DPALPPG 
. . . DPALPPG 

.G PGA 

.G PCS 

EG TGA 

LG TGA 

. . PADPAP 
. . PADPTP 
. .PADPTP 
. . PADPTP 
. . PADPTP 
. . PAVPLP 
. .PADPAP 
. .PATGT. 

GRA 

P AA 

THHHTHPHPH 
THHHTHPHNH 
TQTHPHPNPH 
TQTHPHPHMH 

E GTGA 

E GTGA 

PPA 

PPA 

PPA 

PPA 

PAA 

PRT 

PPS 

PAA 

A RSAY 

P FGEL 

GSVGRGVAGG 

PGA 

PRA 

ADA 

. ..RPAVAGG 



VRAHTMAVLN 
EBEDEPAESG 
RRVPLPTYPW 
RRVPLPTYPW 
RRVPLPTYPW 
RRVPLPTYPW 
ESMPLPSTAG 
RVLDLPTYPF 
RLLDLPTYPF 
RVLDLPTYAF 
RVLDLPTYAF 
RVLDLPTYAF 
RVLDLPTYAF 
RVLDLPTYAF 
RVPDLPTYAF 
RVLDLPTYAF 
TAHDLPTYAF 
HLTTLPTYPF 
HLTTLPTYPF 
HLTTLPTYPF 
HLTTLPTYPF 
HLTTLPTYPF 
RPVELPTYPF 
RRVNPPTYPF 
RRVPLPSYAF 
RRVPLPTYPF 
RTIDLPTYAF 
RTVDLPTYAF 
RTVDLPTYAF 
RTIDLPTYAF 
RTVDLPTYAF 
RWDLPTYAF 
RTVDLPTYAF 
STVELPTYAF 
GLVDLPGYPF 
PPVALPNYPF 
THLDLPTYPF 
-HLDLPTYPF 
THLDLPTYPF 
.HLDLPTYPF 
ARVDLPTYAF 
SRIDLPTYAF 
.RVELPTYAF 
. RVDLPTYAF 
.RVDLPTYAF 
. RADLPTYAF 
GWVDLPTYAF 
GRVDLPKYAF 
RRVELPTYAF 
GWVDLPTYAF 
QPVDLPTYPF 
RGVPLPTYPF 
CGVELPTYAF 
APFALPTYPF 
GWVDLPTYAF 
RPVELPVYPF 
RPAELPTYPF 



DDGN 

VDWNAPPHVL 
QHERCWIEVE 
QHERYWIEDS 
QRERYWIEAP 
QRERYWVDAP 



RER 

PDARR 

VHGSKPSLRL RQLRNGATDH 
AKSAAGDRRG VRAGGHPLLG 
TGGAAGGSRF AHAGSHPLL-- 



DHKRYWLQPA 
DHKRYWIEAT 
QHQRYWVE. . 
QHQRYWLR. . 
QHQRYWVK. . 
QHQRYWLK. . 
QHQRYWLK. . 
QHQRFWAE. . 
QHQRYWAEAG 
HHERYWIEPA 
NHHHYWLDTT 
NHHHYWLDTT 
NHHHYWLDTT 
NHHHYWAVTS 
NHHHYWLDTI 
QRERYWCHP. 
QRERYWYHPT 
HRDRFWLPTA 
QRERVWLEPK 
QRRRYWLADT 
QHQHYWLERS 
QHQHYWLEEP 
QRRSYWL..P 
QHKHYWVEPP 
QRERFWLEGR 
QRQDFWPAPA 
QRRRYWLEAP 
QGKRFWLLPD 
EPQRYWLAPE 
QHQHYWLESS 
QRQHYWLD.A 
QHQHYWLQPP 
QHQHYWLQ™ 
QRERYW.NTR 
QHEHLW.AVP 
DHQHFW. 
DHQHFW. 
DHQHYW. 
DHEHYW. 
DRRHFW. 



PVT 

GAADLTALGL 
.GVDRSAAG. 
.SVDRAAAD. 
.SVDRAAAD. 
.SVDRAAAD. 
.GVDRAAAD. 
.GADRSVAG. 
RSADVSAAGL 
TGTDASGLGL 
PTTPA.TTTQ 
PTTPA.TTTQ 
PTTPA.TTTQ 
PAGVG.DAA. 
DGGGGDDATQ 
GVRGGDPASL 
SGRRGDITAA 
AARRPATSSS 
PVARRSTEVD 
VKRDSGWDPA 
ASASGAVSGE 
SGLTGDAADL 
VDGVGDVRSA 
AAVAAVGGGH 
RGLAGDPAGL 
GGRSGDPAGL 
TG.TQDAAGL 
RTTPRDEL.D 
VS . . - DQLAD 
QPGAGSGSG- 
PTGAGDV— 
TTTTDLTTTG 



TDTAHP 

. GHPLLGV 
• GHPLLGT 
.GHPLLGA 
.GHPLLGT 
.GHPLLGT 
. GHPLLGV 
DAVGHPLLGA 

D— — 

SPTDAQNPAD 
SPTDAWR. . . 
SPTDAWR. . - 

a « . a j^Gll^ a « a 

EKESGPLTRE 
GMDGADHPLL 
GVAEAEHPLL 

EV — 

GS 

QSA 

GMVA 

GLRRVE 

DPVEA 

GL 

GLAASGHP— 

GL 

GWF 

SRYRVD 



LTPTHHPL- 



.LS 
.LR 
.LR 
.LR 
.LH 



DHRHYW. aLR 
DHQHYW a .LQ 
EHRHYW. .LE 
QRQDFWPEAR 
RRDRYWVDAE 
ERERFWLDVE 
QRKRYWLQPA 
QRERYWVAPA 
QRQRYWLPXP 
EHQRFWPRPH 



TAADRTPADA 
PAPEAVAAAD 
PPAVA.DAPA 
PAAQA.DAVS 
YVETATDAA- 
AADTASDAVS 
EAETAEAMG 
PAESATDAAS 
MG6SATDAV~ 
PAEPASA6DP 
PATPAAGADA 
PAGTSGHP— 
GAPRGSGVSG 
APAAASDELA 
EPGPAAGAGS 
TGGRARDEDD 
RPADVSALGV 



PMDAEFWA— 
PDDAAFWTAV 
LGLAGADHPL 

LGQAAAEHPL 

LGLAGADHPL 

M— — 

LGQGAADHPL 

LLGT 

SD 

QW 

YRV-— 
AAATGPAAA- 

DWR 

R 
atave02a
atave05a
atave04a
atave08a
atave03a
atrap02a
atrap11a
atrap08a
atrap12a
atrap05a
atrap09a
atfkb03a
atfkb07x
atfkb08x
atnid01a
atnid03a
atnid02a
atnid00a
atfkb10a
atrap14a
atmon06a
atmon08a
atmon09a
atepo02a
atepo03x
atepo08a
atepo00a
atepo04a
atnid07a
attyl07a
atsor02a
atsorbla
atnys09a
atnys12a
atnys16a
atnys17a
atnys03a
atnys18a
atnys07a
atnys08a
atnys05a
atnys06a
atnys04a
atnys14a
atnys00a
atnys10a
atnys15a
atnys13a
atave10a
atrif02a
atmon03a
atave12a
atrif09a
atmon00a
attyl03a



THHDNQPHTH THLDLPTYPF QHHHYWLE. . STQPGAGNV 

THHHNQPHTH THLDLPTYPF QHHHYWLELP SAQTSPGQRR SRRSAPD 



THHHNQPHTH THLDLPTYPF QHQHYWLE.. 
THHHNQPHTH THLDLPTYPF QHHHYWLE,, 
THHHNQPHTH THLDLPTYPF QHHHYWLQ. < 



GDVPVTRV. 
GDVPVTRV. 
GDAPATRV. 
GDAPATRV. 
GDAPATRV. 
GDVPVTRV. 
GAAP.TDL. 
•GSDRAPV. 
GGASRHDP 



.LDLPTYAF QHQRYWLE. 
.LDLPTYAF QHQRYWLE. 
.LDLPTYAF QHQRYWLE. 
.LDLPTYAF QHQRYWLE. 
, .LDLPTYAF QHQRYWLE. 



STQPGAGSGS GSGSGRAG-" 

STQPGAGNVS AA ~ — 

. .PPGKPSDP SP' 

. .GHPLLGS 
..GHPLLGS 
, .GHPLLGE 
..GHPLLGS 
. .GHPLLGP 



.GTDRATAG. 
.GTDRATAG. 
• GTDPMAAG. 
.GTDRATAG. 
.GADRAAAG. 

LDLPTYAF QQQRYWAEVG RSADVSGAGL DAVGHPLLGA 

PHLPTYPF ERTRYWLGSR AAGDA — 

.ALPTYPF QHKDYWLRAT AQVDVTGAGQ EKVAHPLL-- 

.DVPSYAF QRRPYWIE.S APPATADSG HPVLGT 

.LHTTSPQT HHLDLPTYPF QRDRYWM.EP VRVAQVSGQP GADRLRYRW 

..LHTTSPQS HHLDLPTYPF QRDRYWM.AV PPRAAVGDLA 

..PHPSHIPA QRVSLPAYPF QRRAYWM..P NSAAHIGRSD AEAATRLGLA 
..VLCGASRP RRVELPTYAF QRRTHWAPGL TPNHAPADRP AAEPQRAMAV 

A GG RPVDLPVYPF QHRSYWLAPA VGGGSPTAVP D— 

S GG RAVDLPVYPF QHQSYWLAPA . .APDATAVA PWEEEGGEY 

AGDRVPGL.. ..VELPTYAF QRERFWLSG. RSGGGDAATL GLVAAG 

AGDRVPGL.- . .VELPTYAF QRERFWLSG. RSGGGDAATL GLVAAGHPL- 

PDDPAPRT.. ..VDLPTYAF QGRRFWLADI AAPEAVSSTD GEEA 

PTAG RRVPLPTYPW QRQRYWIEAP AE 

PTAG RRVPLPTYPW QRQRYWPDIE PDSRR.HAAA DPTQGWFY— 

PSGG RRVPLPTYPW QRERYWIEAP VDREA.DGTG — 

PSGG RRVPLPTYPW QRERYWIDTK ADDAA.RGDR RAPGAGHDEV 

PDGA RRVALPMYPW QRERHWMDLT PRSAA.PAGI AGRWPLAGVG 

...EG..AGA RRVDLPTYPF QHTRYWL— 

A. .DGPEGPA RRVELPVHAF RHRRYWLAPG RAA 

A PFAP R 

A PFAP CKVPLPTYTF 

AGT GA RRTDLPTYAF QRRRYWPKAL Q5GTA.DLRS VGLGAA 

ADH GA RRTTLPTYAF QRERYWPDTT AATSA.HTPG SALDAEFW— 

TGT GA RGTDLPTYAF QRERYWPE.. LAAEP.AG, . GGADAADA— 

AGT GG RRIALPTYAF QRERYWPS . . LAAQA.PGDA GG 

DGT GA RRADLPTYPF QHQRFWPT. . AAR.A.AQDV TAAGLGAADH 

AGT GA HRTDLPTYAF QYERYWPK. 

AGV GA GRVELPTYAF QRGWFWPV. 

AGV GA GRVELPTYAF QRGWFWPV. 

AGS. . . . .GA TRVDLPTYAF QHATYWPT. 

APT GA RPVDLPTYAF QHRPFWPS. 

RGL DP VRVDLPTYAF QHRWFWPA. 

AGR GA QWLDLPTYPF QRGRFWPE. 

AGT GA RRVELPTYAF QHVRHWPT. 

AGT GA RRTDLPTYAF QHAYYWPQ. 

AGT..RTPQA DPVBLPTYAF QRARYWPTLG ARHGD.PADL G— 

EAT GG HRTDLPTYAF QRERYWPELG APVAT.APQD PAAW— 

EGTAREVGDG CGVBLPTYAF ERERFWLDVB EGSAG.GSGV SGMWGGPLWE 

DEPATA VGTVLPTYAF QHQRFWVDVD ET 

PAD A GQVPLPTYRF QRRRYWRVAP DAAAP.ARAA GLQ 

PERDR A RHLDLPTYAF DHHRYWVDTS AGHPG.DLSA AGLGT 

PPVTGF VDLPKYAF DQQHYWLQPA AQATD.AASL GQV ' 

GAT AT RRFPLPTYPF QRERHWPAAA GVGQQ.PETP ELP 

APAPDAASLA VAAELPTYAF QRTHYWLDAP AAPAALPAGL DDAGHPLLSA 
**** 



ATY.R.PADA TGL 

GRVGV.GGDV 

GRVGV.GGDV GAVGLGSAGH 

GTLPT.,AHA AAVGL 

GPRDT..ADA AAVGIAGASH 
ARPAR.PDDV RAAGLGAA— 
SLPGA.ASAA PAAGQPA--"- 
PPRPN.GAGP GALGHPLLG- 
LPTPA.AALA AADPADQQLW 